|
Category
|
Node Name
|
Description
|
|
Audio
|
Audio Concat
|
The Audio Concat Node takes two separate audio streams and joins them chronologically. The output is a single audio file or stream where the second input begins immediately after the first ends.
Expects Inputs:
-Audio Audio 1: Primary clip.
-Audio Audio 2: Clip to append.
Provides Outputs:
-Audio Concatenated Audio: The full sequential stream.
|
|
Audio
|
Audio Gain
|
The Audio Gain Node scales the magnitude of an audio signal. The node multiplies every sample in the digital audio stream by a specific value.
Expects Inputs:
-Audio Audio: Initial clip.
-Float Gain: The multiplier applied to every sample. A gain of 0.5 halves the amplitude (about -6 dB), and a gain of 2.0 doubles it (about +6 dB).
Provides Outputs:
-Audio Audio with Gain: Scaled audio output.
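The relationship between the Gain multiplier and the decibel figures quoted above can be checked with a short sketch. This is only an illustration of the arithmetic, not the node's implementation; the function names `gain_to_db` and `apply_gain` are hypothetical.

```python
import math


def gain_to_db(gain: float) -> float:
    """Convert a linear amplitude multiplier to decibels (20 * log10)."""
    if gain <= 0:
        raise ValueError("gain must be positive to be expressed in dB")
    return 20.0 * math.log10(gain)


def apply_gain(samples: list[float], gain: float) -> list[float]:
    """Multiply every sample by the gain value (hypothetical helper)."""
    return [s * gain for s in samples]


if __name__ == "__main__":
    print(round(gain_to_db(0.5), 2))  # roughly -6 dB
    print(round(gain_to_db(2.0), 2))  # roughly +6 dB
    print(apply_gain([0.2, -0.4], 0.5))
```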
|
|
Audio
|
Audio Mix
|
The Audio Mix Node combines multiple incoming audio signals into a single output stream by mathematically summing their waveforms.
Expects Inputs:
-Audio Audio 1: Initial clip.
-Audio Audio 2: Second clip to mix with the first.
-Float Mix: Crossfades the amplitude between inputs using a center-weighted blend. At -1.0, Input 2 is fully attenuated. As the value moves toward 1.0, Input 1 fades out while Input 2 fades in. At 0.0, both signals are summed at equal volume.
-Float Audio 2 Delay (secs): Offsets the start time of the second input in seconds.
Provides Outputs:
-Audio Mixed Audio: Mixed audio file.
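One way to read the Mix and Audio 2 Delay parameters is sketched below. The exact blend curve is not specified here, so the piecewise-linear gains in `mix_gains` are an assumption chosen only to match the three stated points (-1.0, 0.0, and 1.0). The function names and the `sample_rate` parameter are also hypothetical.

```python
def mix_gains(mix: float) -> tuple[float, float]:
    """Return (gain for Input 1, gain for Input 2) for a Mix value in [-1, 1].

    Assumed curve: both inputs at full volume at 0.0, with one input
    fading linearly to silence toward either extreme.
    """
    mix = max(-1.0, min(1.0, mix))
    return min(1.0, 1.0 - mix), min(1.0, 1.0 + mix)


def mix_audio(a: list[float], b: list[float], mix: float = 0.0,
              delay_secs: float = 0.0, sample_rate: int = 48000) -> list[float]:
    """Sum two sample lists, starting the second one delay_secs later."""
    g1, g2 = mix_gains(mix)
    offset = max(0, int(round(delay_secs * sample_rate)))
    out = [0.0] * max(len(a), offset + len(b))
    for i, s in enumerate(a):
        out[i] += g1 * s
    for i, s in enumerate(b):
        out[offset + i] += g2 * s
    return out


if __name__ == "__main__":
    print(mix_gains(-1.0), mix_gains(0.0), mix_gains(1.0))
    print(mix_audio([1.0, 1.0], [0.5], delay_secs=1.0, sample_rate=1))
```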
|
|
Audio
|
Audio Trim
|
Audio Trim extracts a specific segment of audio and defines its timing and volume fades within the timeline.
Expects Inputs:
-Audio Audio: The full audio stream.
-Float Start (secs): The timestamp in the source at which the segment begins.
-Float Duration (secs): The length of the segment to keep.
-Float Delay (secs): The offset/start time on the main timeline.
-Float Fade In (secs): Volume ramp duration at the start.
-Float Fade Out (secs): Volume ramp duration at the end.
Provides Outputs:
-Audio Trimmed Audio: The processed audio segment.
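The Start, Duration, Fade In, and Fade Out parameters can be illustrated with a minimal sketch. It assumes linear volume ramps, which this reference does not confirm, and it leaves out the Delay parameter, which only positions the segment on the timeline. `trim_with_fades` is a hypothetical name.

```python
def trim_with_fades(samples: list[float], sample_rate: int, start: float,
                    duration: float, fade_in: float = 0.0,
                    fade_out: float = 0.0) -> list[float]:
    """Cut a segment out of samples and apply linear fades (assumed shape)."""
    first = int(round(start * sample_rate))
    count = int(round(duration * sample_rate))
    segment = list(samples[first:first + count])

    # Never let a fade run longer than the segment itself.
    n_in = min(len(segment), int(round(fade_in * sample_rate)))
    n_out = min(len(segment), int(round(fade_out * sample_rate)))

    for i in range(n_in):
        segment[i] *= i / n_in                      # ramp up from silence
    for i in range(n_out):
        segment[len(segment) - 1 - i] *= i / n_out  # ramp down to silence
    return segment


if __name__ == "__main__":
    # 10 samples at 1 Hz: keep 4 seconds starting at 2 s, with 2 s fades.
    print(trim_with_fades([1.0] * 10, 1, start=2, duration=4,
                          fade_in=2, fade_out=2))
```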
|
|
GenAI/3D
|
Meshy Image to 3D
|
The Meshy 3D Model Node converts 2D image inputs and text prompts into a 3D model with AI-generated texturing.
Expects Inputs:
-Image Model Image: The primary reference image for geometry generation.
-Image Alt Angle (optional): Up to three additional images to provide the AI with more perspective on the object.
-Integer Max Triangles: A cap on the mesh density (polygon count) to optimize performance.
-Text Texture Prompt: A text description guiding the look, material, and style of the surface.
-Image Texture Image: A reference image to guide specific patterns or material colors.
-Text Texture: Select No Texture to disable texture creation.
-Text Topology: Select either Triangle or Quad for topology.
-Text AI Model: The model to use.
Provides Outputs:
-Model-3D Generated Model: The resulting 3D model file, typically including vertex data and mapped textures.
|
|
GenAI/3D
|
Meshy Retopologizer
|
The Meshy Retopologizer Node rebuilds the polygon topology of a 3D model using AI. It replaces dense, irregular meshes (common in AI-generated or sculpted models) with cleaner, more efficient geometry suitable for animation, real-time rendering, or further production work.
Expects Inputs:
-Model-3D Model: The source 3D mesh to be retopologized.
-Integer Max Polygons: A cap on the total polygon count for the retopologized mesh.
-Float Height (meters): The target real-world height of the model in meters, used for correct scaling.
-Text Set Origin: Select Origin at the bottom to reposition the model's origin point to the bottom center of the bounding box, which is standard for placing characters on ground planes.
-Text Topology: Select either Triangle or Quad for topology.
Provides Outputs:
-Model-3D Retopologized Model: The rebuilt 3D model with optimized topology.
|
|
GenAI/3D
|
Meshy Rigger
|
The Meshy Rigger Node automatically generates a skeletal rig for a 3D model using AI, making it ready for animation. It analyzes the mesh geometry to place bones and joint hierarchies appropriate for the model's shape. Note that this process will strip any existing textures from the model.
Expects Inputs:
-Model-3D Model: The source 3D mesh to be rigged.
-Float Height (meters): The target real-world height of the model in meters, used to correctly scale the skeleton.
-Text Animate: Select Animate to apply a default animation cycle to the rigged model for immediate preview.
Provides Outputs:
-Model-3D Rigged Model: The 3D model with an embedded skeletal rig ready for animation.
|
|
GenAI/3D
|
Meshy Text to 3D
|
The Meshy Text to 3D Node generates a 3D model from a text description using AI. Unlike the image-based Meshy Node, this variant relies entirely on a written prompt to define the object's shape and appearance.
Expects Inputs:
-Text Prompt: A text description of the 3D object to generate (e.g., ‘a medieval wooden shield’).
-Integer Max Triangles: A cap on the mesh density (polygon count) to optimize performance.
-Text Texture Prompt: A text description guiding the look, material, and style of the surface.
-Image Texture Image: A reference image to guide specific patterns or material colors.
-Text Texture: Select No Texture to disable texture creation.
-Text Create Humanoid: Select Humanoid to generate a bipedal character model.
-Text Topology: Select either Triangle or Quad for topology.
-Text AI Model: The model to use.
Provides Outputs:
-Model-3D Generated Model: The resulting 3D model file, typically including vertex data and mapped textures.
|
|
GenAI/3D
|
Meshy Textureizer
|
The Meshy Textureizer Node applies AI-generated textures to an existing 3D model. It takes a bare or previously textured mesh and re-skins it based on a text prompt and optional reference image, allowing rapid iteration on surface appearance without regenerating the geometry.
Expects Inputs:
-Model-3D Model: The source 3D mesh to be textured.
-Text Texture Prompt: A text description guiding the desired material, color, and surface style.
-Image Texture Image: A visual reference to guide the patterns or colors of the texture map.
-Text Original UVs: Select Generate UVs to allow the model to generate new UVs.
-Text AI Model: The model to use.
Provides Outputs:
-Model-3D Textured Model: The 3D model with newly generated texture maps applied.
|
|
GenAI/3D
|
Sharp Splat from Image
|
Create a Gaussian splat using the Apple Sharp model.
Expects Inputs:
-Image Image: The image to use for the Gaussian splat generation.
-Text Host: The host for the Sharp service.
Provides Outputs:
-Model-splat Generated Splat: The generated Gaussian splat.
|
|
GenAI/3D
|
Tripo Image to 3D
|
The Tripo 3D Model Node generates a 3D mesh and high-fidelity textures from a primary reference image and text-based style prompts.
Expects Inputs:
-Image Model Image: The primary reference for geometry generation.
-Image Alt Angle x3: Additional perspectives to improve 3D reconstruction accuracy.
-Integer Max Triangles: Sets the polygon limit for the generated mesh.
-Integer Seed: Sets a seed value for generation.
-Texture: Set to true to inhibit texture generation.
Provides Outputs:
-Model-3D Generated Model: The finalized 3D mesh data.
|
|
GenAI/3D
|
World Labs Image to Splat
|
Generates a Gaussian splat from an image using World Labs Marble.
Expects Inputs:
-Text Prompt (optional): Text prompt to guide the generation.
-Image Front Image: The front-facing source image.
-Image Right Image (optional): The right-facing reference image.
-Image Back Image (optional): The back-facing reference image.
-Image Left Image (optional): The left-facing reference image.
-Image Front Right (optional): The front-right facing reference image.
-Image Back Right (optional): The back-right facing reference image.
-Image Back Left (optional): The back-left facing reference image.
-Image Front Left (optional): The front-left facing reference image.
-Boolean Image Is Panorama: Whether the input image is a panoramic image.
-Integer Seed (optional): If set, will influence the determinism of the generation.
-String Model: Select the desired model.
Provides Outputs:
-Model-splat Generated World: The generated Gaussian splat world.
-Model-3D Collider Model: A 3D collider model for the generated world.
-Image Panorama Image: A panoramic image of the generated world.
|
|
GenAI/3D
|
World Labs Text to Splat
|
Generates a Gaussian splat from a text prompt using World Labs Marble.
Expects Inputs:
-Text Prompt: The text prompt describing the world to generate.
-Integer Seed (optional): If set, will influence the determinism of the generation.
-String Model: Select the desired model.
Provides Outputs:
-Model-splat Generated World: The generated Gaussian splat world.
-Model-3D Collider Model: A 3D collider model for the generated world.
-Image Panorama Image: A panoramic image of the generated world.
|
|
GenAI/3D
|
World Labs Video to Splat
|
Generates a Gaussian splat from a video using World Labs Marble.
Expects Inputs:
-Text Prompt (optional): Text prompt to guide the generation.
-Video Video: The source video to generate the world from.
-Integer Seed (optional): If set, will influence the determinism of the generation.
-String Model: Select the desired model.
Provides Outputs:
-Model-splat Generated World: The generated Gaussian splat world.
-Model-3D Collider Model: A 3D collider model for the generated world.
-Image Panorama Image: A panoramic image of the generated world.
|
|
GenAI/Beeble
|
Beeble SwitchX
|
Performs video-to-video generation using the Beeble AI SwitchX model.
Expects Inputs:
-Video Video: Source video to be modified.
-Text Prompt: Prompt guiding the video modification. Either Prompt or Reference Image is required.
-Image Reference Image: Image guiding the video modification. Either Prompt or Reference Image is required. Also required if Alpha Mode is set to Select Alpha.
-Video Alpha Video: A video of the alpha (matte) channel. Required if Alpha Mode is set to Custom Alpha.
-Text Max Resolution: The desired resolution of the generated video.
Provides Outputs:
-Video Generated Video: The generated content video.
-Video Generated Alpha: The generated alpha video.
|
|
GenAI/ElevenLabs
|
Voice Aggregator
|
Utility node to assist in processing ElevenLabs nodes that output both an audio sample and an ID. It is a pass-through node.
Expects Inputs:
-Audio Audio: The audio file.
-Text ID: The ID text.
Provides Outputs:
-Audio audio: The audio file.
-Text ID: The ID text.
|
|
GenAI/ElevenLabs
|
Voice Changer
|
Transforms audio from one voice to another while maintaining full control over emotion, timing, and delivery, using the ElevenLabs Speech-to-Speech API.
Expects Inputs:
-Audio Input Audio: The source audio with the voice to be changed.
-Text Voice ID: The Voice ID text from ElevenLabs voices, or a Voice ID created with the ElevenLabs Voice Creator node.
-JSON Settings: Optional additional voice settings as JSON; see the ElevenLabs documentation for examples.
-Integer Seed: Optional GenAI seed to guide repeatable generations.
-Boolean Remove Noise: Set to Remove Noise to automatically remove background noise.
-Text Model: Model name to use.
-Text Output Format: The desired output format, in the form of Format, Sample Rate, Bitrate.
Provides Outputs:
-Audio Generated Audio: The audio with the voice changed to the voice defined by the Voice ID.
|
|
GenAI/ElevenLabs
|
Voice Creator
|
Create a voice from a previously generated voice preview using the ElevenLabs Create a voice API.
Expects Inputs:
-Text Voice Name: A unique name for the voice that will be created.
-Text Voice Descrip: A description of the new voice.
-Text Gen Voice ID: A Generated Voice ID from the ElevenLabs Voice Designer node.
-JSON Labels: Optional text-to-text map of desired labels for the voice.
Provides Outputs:
-Text Voice ID: A newly minted Voice ID that can be used with the ElevenLabs Voice Changer Node.
-Audio Preview Audio: Example audio with the new voice.
-Text Name: The name of the voice that was created.
-Text Category: The category that the created voice is in.
|
|
GenAI/ElevenLabs
|
Voice Designer
|
Design a voice via a prompt using the ElevenLabs Voice Design API.
Expects Inputs:
-Text Voice Descrip: A detailed description of the voice to be created, including elements like intonation, pacing, quality, etc.
-Text Text: Optional text for the created voice to read. If not provided, ElevenLabs will generate sample text automatically.
-Audio Ref Audio: Optional reference audio; may only be used with the Eleven TTV v3 model.
-Float Loud: Optional loudness setting that controls the volume level of the generated voice. -1 is quietest, 1 is loudest, and 0 corresponds to roughly -24 LUFS.
-Integer Seed: Optional GenAI seed to guide repeatable generations.
-Float Guidance: Optional Guidance Scale influencing how closely the voice adheres to the prompt, 0 being very loose and 100 being very tight.
-Text Remix ID: Optional Remixing Session ID as generated by the ElevenLabs Voice Remixer node.
-Text Remix Iter ID: Optional Remixing iteration value, when iterating on remixes.
-Float Quality: Optional value between -1 and 1, trading off variability and quality, with -1 being highly variable and 1 being high quality.
-Float Strength: Optional value between 0 and 1, influencing how strongly the voice adheres to the description.
-Boolean Enhance: Set to Enhance to pre-process the reference audio.
-Text Model: Model name to use.
-Text Output Format: The desired output format, in the form of Format, Sample Rate, Bitrate.
Provides Outputs:
-Text Gen Voice ID: The Generated Voice ID that can be used with the ElevenLabs Voice Creator node.
-Audio Generated Audio: A sample audio created with the generated voice.
|
|
GenAI/ElevenLabs
|
Voice Remixer
|
Remix an existing voice via a prompt using the ElevenLabs Voice Remix API.
Expects Inputs:
-Text Voice ID: An ElevenLabs Voice ID on which to base the remixed voice.
-Text Voice Descript: A detailed description of the voice to be created, including elements like intonation, pacing, quality, etc.
-Text Text: Optional text for the created voice to read. If not provided, ElevenLabs will generate sample text automatically.
-Float Loud: Optional loudness setting that controls the volume level of the generated voice. -1 is quietest, 1 is loudest, and 0 corresponds to roughly -24 LUFS.
-Integer Seed: Optional GenAI seed to guide repeatable generations.
-Float Guidance: Optional Guidance Scale influencing how closely the voice adheres to the prompt, 0 being very loose and 100 being very tight.
-Text Remix ID: Optional Remixing Session ID as generated by the ElevenLabs Voice Remixer node.
-Text Remix Iter ID: Optional Remixing iteration value, when iterating on remixes.
-Float Strength: Optional value between 0 and 1, influencing how strongly the voice adheres to the description.
-Text Output Format: The desired output format, in the form of Format, Sample Rate, Bitrate.
Provides Outputs:
-Text Gen Voice ID: The Generated Voice ID that can be used with the ElevenLabs Voice Creator node.
-Audio Generated Audio: A sample audio created with the generated voice.
|
|
GenAI/Google
|
Chirp 3 Custom Voice TTS
|
Synthesizes speech using a custom cloned voice via Google's Chirp 3 API.
Expects Inputs:
-Text What to Say: The text you want spoken.
-Text Voice Cloning Key: The secret key for your cloned custom voice.
-Text Language Code: The language you want the speech generated in.
Provides Outputs:
-Audio Generated Audio: The synthesized audio stream.
|
|
GenAI/Google
|
Chirp 3 Custom Voice
|
Generates a voice cloning key using Google's Chirp 3 Instant Custom Voice.
Expects Inputs:
-Audio Reference Audio: An audio file containing the target voice to clone; about 10 seconds of audio is required. An example script is, 'To build a great synthetic voice, we need to capture a wide variety of sounds. From the sharpest consonants to the smoothest vowels, every single syllable matters.'
-Audio Consent Audio: An audio file containing the consent statement read by the voice actor; about 10 seconds of audio is required. The script is, 'I am the owner of this voice, and I consent to Google using this voice to create a synthetic voice model.'
-Text Language Code: The language of the audio (e.g., en-US).
Provides Outputs:
-Text Voice Cloning Key: The generated cloning key text, which can be passed as a voice_name to TTS nodes.
|
|
GenAI/Google
|
Chirp 3 HD TTS
|
Synthesizes ultra-realistic, emotionally resonant speech using Google's generative Chirp 3 HD Voices API.
Expects Inputs:
-Text What to Say: The text to be converted to speech.
-Text Voice Name: The exact name of the HD voice model.
-Text Language Code: The language code mapping to the selected voice.
Provides Outputs:
-Audio Generated Audio: The synthesized audio stream.
|
|
GenAI/Google
|
Gemini Flash2.5 Image
|
Generates high-quality images from text descriptions and optional reference images using the Gemini Flash 2.5 model.
Expects Inputs:
-Text Prompt: The text description detailing the subject, style, lighting, and composition of the desired image.
-Image Ref Image (optional) x2: Visual references that the AI uses to influence the style, structure, or content of the new generation.
-Text / Enum Aspect Ratio: The framing dimensions for the output (e.g., 1:1, 16:9, 9:16).
-Integer Seed: A specific number used to initialize the generation; using the same seed with the same prompt will produce the same result.
Provides Outputs:
-Image Generated Image: The final AI-generated image file.
|
|
GenAI/Google
|
Gemini Flash2.5 Isolate
|
Leverages vision-language models to identify a specific object in an image based on a text query and crop the image around it.
Expects Inputs:
-Text Item to Isolate: A natural language description of the object you want to extract (e.g., 'the blue coffee mug').
-Image Image: The source image containing the item.
Provides Outputs:
-Image Isolated Image: A smaller image file cropped in a square around the prompted item.
-Image Isolated Mask: A mask representation of that image.
|
|
GenAI/Google
|
Gemini Flash2.5 Segment
|
Leverages vision-language models to identify and segment specific objects from an image based on a text query.
Expects Inputs:
-Text Item to Segment: A natural language description of the object you want to extract (e.g., ‘the blue coffee mug’).
-Image Image: The source image containing the item.
-Integer Mask Threshold: Controls the tightness of the resulting mask.
Provides Outputs:
-Image Segmented Image: The segmented image.
-Image Segmented Mask: The segmentation mask.
|
|
GenAI/Google
|
Gemini Flash2.5 Text
|
Processes natural language prompts to generate creative copy, analyze data, or describe visual inputs. It can utilize reference images to provide high-context answers or follow specific Agent Instructions for tailored personas.
Expects Inputs:
-Text Prompt: The primary text instruction or question for the AI.
-Image Ref Image (optional) x2: Visual context for the AI to ‘look at’ when generating its response.
-Text Agent Instructions: High-level system instructions to define the AI’s behavior, tone, or specific formatting requirements.
-Integer Seed: A numerical value used to ensure reproducible text outputs.
Provides Outputs:
-Text Generated Text: The finalized text generated by the model.
|
|
GenAI/Google
|
Gemini Flash2.5 Transcribe
|
Utilizes Gemini’s multimodal capabilities to listen to audio streams and generate highly accurate transcriptions. It can distinguish between different voices and provide precise timing for when each word or sentence was spoken.
Expects Inputs:
-Audio Audio: The source sound file or stream to be transcribed.
-Boolean Include Timestamps: A toggle to determine if the output should include start and end times for the transcribed text.
-Boolean Include Speakers: A toggle to enable diarization (identifying and labeling different speakers in the audio).
Provides Outputs:
-Text Generated Text: The finalized transcription text, formatted based on the input settings.
|
|
GenAI/Google
|
Gemini Flash2.5 Text-to-Speech
|
Transforms text into synthetic speech.
Expects Inputs:
-Text What to Say: The actual text content to be converted into speech.
-Text voice or voice1, voice...: Specifies the desired voice model (e.g., en-US-Studio-O, en-US-Neural2-D) or a prioritized list of voices to use for the generation.
-Text Language Code: The BCP-47 language tag (e.g., en-US, fr-FR) to ensure correct pronunciation and accent.
Provides Outputs:
-Audio Generated Audio: The generated synthetic speech audio stream.
|
|
GenAI/Google
|
Gemini Flash3.1 Image
|
Generates high-quality images from text descriptions and optional reference images using the Gemini Flash 3.1 model.
Expects Inputs:
-Text Prompt: Detailed text description of the image to be generated.
-Image Ref Image (optional) x14: Multiple visual references used to guide the AI on style, layout, or specific details.
-Text Aspect Ratio: Defines the frame dimensions (e.g., 16:9, 1:1, 9:16).
-Text Size (512, 1K, 2K, or 4K): Determines the output resolution and detail density of the final image.
-Integer Seed: A numerical value to ensure reproducible results or for iterative tweaking of a specific generation.
Provides Outputs:
-Image Generated Image: The finalized AI-generated image asset.
|
|
GenAI/Google
|
Gemini Pro3.1 Media Inspector
|
Utilizes Gemini 3.1 Pro capabilities to inspect multiple forms of media (up to 8 images, 4 videos, and 1 audio file) and provide a text response based on a prompt.
Expects Inputs:
-Text Prompt: The text prompt to be sent to the model (required).
-Image Images: Up to 8 images.
-Video Videos: Up to 4 videos.
-Audio Audio: 1 audio file.
Provides Outputs:
-Text Generated Text: The response from the model.
|
|
GenAI/Google
|
Gemini Pro3.1 Text
|
Processes natural language prompts to generate creative copy, analyze data, or describe visual inputs. It can utilize reference images to provide high-context answers or follow specific Agent Instructions for tailored personas.
Expects Inputs:
-Text Prompt: The primary text instruction or question for the AI.
-Image Ref Image (optional) x2: Visual context for the AI to ‘look at’ when generating its response.
-Text Agent Instructions: High-level system instructions to define the AI’s behavior, tone, or specific formatting requirements.
-Integer Seed: A numerical value used to ensure reproducible text outputs.
Provides Outputs:
-Text Generated Text: The AI-generated text response.
|
|
GenAI/Google
|
Gemini Pro3.0 Audio Inspector
|
Utilizes Gemini’s multimodal capabilities to listen to audio streams and process them as requested in the prompt.
Expects Inputs:
-Text Prompt: The prompt to be sent to the model.
-Audio Audio: The source sound file or stream to be processed.
Provides Outputs:
-Text Generated Text: The finalized response from the model.
|
|
GenAI/Google
|
Gemini Pro3.0 Image
|
Generates high-quality images from text descriptions and optional reference images using the Gemini Pro 3.0 model.
Expects Inputs:
-Text Prompt: Detailed text description of the image to be generated.
-Image Ref Image (optional) x6: Multiple visual references used to guide the AI on style, layout, or specific details.
-Text Aspect Ratio: Defines the frame dimensions (e.g., 16:9, 1:1, 9:16).
-Text Size (1K, 2K, or 4K): Determines the output resolution and detail density of the final image.
-Integer Seed: A numerical value to ensure reproducible results or for iterative tweaking of a specific generation.
Provides Outputs:
-Image Generated Image: The finalized AI-generated image asset.
|
|
GenAI/Google
|
Gemini Pro3.0 Text
|
Processes natural language prompts to generate creative copy, analyze data, or describe visual inputs. It can utilize reference images to provide high-context answers or follow specific agent instructions for tailored personas.
Expects Inputs:
-Text Prompt: The primary text instruction or question for the AI.
-Image Ref Image (optional) x2: Visual context for the AI to 'look at' when generating its response.
-Text Agent Instructions: High-level system instructions to define the AI’s behavior, tone, or specific formatting requirements.
-Integer Seed: A numerical value used to ensure reproducible text outputs.
Provides Outputs:
-Text Generated Text: The finalized text generated by the model.
|
|
GenAI/Google
|
Imagen 3.0 Mask Inpaint
|
Allows for detailed image editing by painting new content into a defined area of an existing image. It utilizes a background image as a base and a mask to designate which pixels the AI should regenerate based on the provided description.
Expects Inputs:
-Text Describe Inpainting: A text prompt detailing exactly what should be generated within the masked area.
-Image Background Image: The original source image that serves as the base for the edit.
-Image Mask Image (optional): A grayscale or alpha mask where white/opaque areas indicate the region to be changed and black/transparent areas remain untouched.
Provides Outputs:
-Image Inpainted Image: The finalized image with the specified region inpainted.
|
|
GenAI/Google
|
Imagen 3.0 Mask Outpaint
|
Used for generative expansion of an image (uncropping), allowing artists to create larger scenes while maintaining the style and content of the original background.
Expects Inputs:
-Text Describe Outpainting: A text prompt detailing what should be generated in the expanded areas.
-Image Background Image: The source image to be expanded.
-Image Mask Image (optional): Defines the specific area outside the original image boundaries to be generated.
-Boolean Blend Original: A toggle to enable or disable feathered blending between the source image and the newly generated content.
Provides Outputs:
-Image Outpainted Image: The finalized, expanded image asset.
|
|
GenAI/Google
|
Imagen 4.0 Super-Res
|
Increases the pixel dimensions of an input image while intelligently reconstructing textures and edges. This is essential for bringing low-resolution generations or cropped assets up to production standards (2K/4K) without the blurring typical of standard bicubic upscaling.
Expects Inputs:
-Image Input Image: The source image to be upscaled.
-Integer Scale (2, 3, or 4): The multiplication factor for the resolution (e.g., a scale of 2 turns a 1080p image, 1920x1080, into 4K, 3840x2160).
Provides Outputs:
-Image Super-Res Image: The high-resolution, enhanced version of the source image.
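The effect of the Scale input on pixel dimensions is simple multiplication, as the sketch below shows. `upscaled_size` is a hypothetical helper written for illustration and is not part of the node.

```python
def upscaled_size(width: int, height: int, scale: int) -> tuple[int, int]:
    """Return the output dimensions for a supported scale factor."""
    if scale not in (2, 3, 4):
        raise ValueError("scale must be 2, 3, or 4")
    return width * scale, height * scale


if __name__ == "__main__":
    # A 1080p frame doubled in each dimension reaches 4K UHD.
    print(upscaled_size(1920, 1080, 2))
```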
|
|
GenAI/Google
|
Lyria2.0 Music
|
Creates high-quality 48 kHz stereo audio compositions by translating natural language descriptions into fully realized musical pieces.
Expects Inputs:
-Text Music Prompt: A descriptive text in English detailing the desired genre, mood, instrumentation, and style (e.g., ‘A calm acoustic folk song with a gentle guitar melody’).
-Text Negative Prompt: A description of specific elements, such as vocals, drums, or fast tempo, that the model should exclude from the generated audio.
-Integer Seed: A specific value used for deterministic generation, ensuring that the same prompt and parameters produce the same audio output.
Provides Outputs:
-Audio Generated Music: A generated instrumental audio track, typically up to 30 seconds in length, provided in a high-quality format like WAV.
|
|
GenAI/Google
|
Lyria 3 Music
|
Generates high-quality audio using the Google Lyria 3 REST API. Provide a text prompt, an optional image URI, and choose a lyria-3 model.
Expects Inputs:
-Text Music Prompt: The prompt guiding the generation of music.
-Image Image: Optional reference image to guide the generation of music.
-String Model: The Lyria model to use.
Provides Outputs:
-Audio Generated Music: The generated music.
-Text Lyrics: Any lyrics generated in the music.
-Text Description: A description of the generated music.
|
|
GenAI/Google
|
Veo3
|
Produces high-quality video sequences from natural language descriptions using Veo3.
Expects Inputs:
-Text Prompt: A detailed description of the scene, including character actions, lighting, and camera movement.
-Image Image (optional): A reference image used to guide the initial frame, visual style, or character design for the video generation.
-Text Negative Prompt: Specific elements, visual styles, or camera behaviors to be excluded from the generated sequence.
-Boolean Portrait: A toggle to switch between standard widescreen (landscape) and vertical (portrait) aspect ratios for mobile-optimized delivery.
-Integer Seed: Set a value to control repeatability in video generation.
-Boolean Lossless: Set to true to generate a lossless video (very large file size).
Provides Outputs:
-Video Generated Video: The finalized cinematic video file with natively generated audio.
|
|
GenAI/Google
|
Veo3.1
|
Uses advanced generative AI to create smooth, natural video sequences that bridge two different images representing the first and last frame.
Expects Inputs:
-Text Prompt: A text description detailing the action, lighting, and style of the transition to guide the AI's interpolation.
-Image Start Image (optional): The starting reference frame that defines the beginning of the video sequence.
-Image End Image (optional): The ending reference frame that defines the final state of the video sequence.
-Text Negative Prompt: Elements or styles to be explicitly excluded from the generated transition.
-Boolean Portrait: A toggle to switch between widescreen (16:9) and vertical (9:16) aspect ratios.
-Float Duration (secs): Requested video duration (may not be respected in certain combinations of options).
-Integer Seed: Set a value to control repeatability in video generation.
-Boolean Lossless: Set to true to generate a lossless video (very large file size).
-Text Resolution: Desired resolution for the video: 720p, 1080p, or 4k.
-Text Model: Veo model to use: Standard, Fast, or Lite (note: 4k is not supported with the Lite model).
-Text Audio: Select No Audio to suppress audio generation.
Provides Outputs:
-Video Generated Video: The finalized video clip (typically 8 seconds) featuring smooth motion and natively generated audio.
|
|
GenAI/Google
|
Veo3.1 Ref Images
|
Generates cinematic video sequences while maintaining strict subject, character, or style consistency through multiple visual references.
Expects Inputs:
-Text Prompt: Detailed instructions on action, lighting, and camera movement.
-Image Ref Image (optional) x3: Upload up to three images of a specific subject to help the AI preserve their identity.
-Text Negative Prompt: Descriptive text for elements to exclude from the generation.
-Boolean Portrait: Toggle for vertical 9:16 or widescreen 16:9 aspect ratio.
-Integer Seed: Set a value to control repeatability in video generation.
-Boolean Lossless: Set to true to generate a lossless video (very large file size).
-Text Resolution: Desired resolution for the video: 720p, 1080p, or 4k.
-Text Model: Veo model to use: Standard or Fast.
-Text Audio: Select No Audio to suppress audio generation.
Provides Outputs:
-Video Generated Video: The finalized 8-second cinematic video featuring smooth motion and natively generated audio.
|
|
GenAI/Google
|
Veo3.1 Video Extension
|
Uses generative AI to lengthen existing video clips by synthesizing new, contextually relevant frames based on a text prompt.
Expects Inputs:
-Text Prompt: A text description of the action or visual elements you want to see in the newly generated extension.
-Video Video: The source video clip that provides the starting context for the extension.
-Text Negative Prompt: A text description of elements you want the AI to specifically avoid in the generated footage (e.g., ‘blur’, ‘distorted faces’).
-Boolean Portrait: A toggle to define the aspect ratio; when enabled, the model optimizes for vertical mobile-first content rather than landscape cinematic content.
-Integer Seed: Set a value to control repeatability in video generation.
-Boolean Lossless: Set to true to generate a lossless video (very large file size).
-Text Model: Veo model to use: Standard, Fast, or Lite.
Provides Outputs:
-Video Generated Video: An extended version of the input video containing the original footage followed by the newly synthesized AI sequence.
|
|
GenAI/Sync
|
Sync Lipsync
|
Generates a lip-synced video using the Sync API. Uploads a source video and audio track, then uses AI to synchronize the lip movements in the video to match the provided audio.
Expects Inputs:
-Video Source Video: The video containing the face(s) to lip-sync. Must be under 20 MB.
-Audio Audio: The audio track to sync to the video. Must be under 20 MB.
-Text Prompt: Optional text prompt to guide the animation.
-Float Temperature: The temperature to use for the generation.
-Text Model: The model to use for the generation.
-Text Sync Mode: How to handle length differences between video and audio (Loop, Freeze, or Trim).
-Text Model Mode: The model mode to use for the generation (react-1 only).
Provides Outputs:
-Video Generated Video: The lip-synced output video.
|
|
GenAI/Topaz
|
Topaz Video Enhancement Resolution
|
This node is designed to take a source video and increase its quality and pixel count using Topaz.
Dropdowns (Top to Bottom):
-Codec (AV1, H264, H265, VP9).
-Output File Format (Auto, .mkv, .mp4, .mov).
-Audio Codec (AAC, AC3, PCM).
-Processing Mode (Copy, Convert, None).
Expects Inputs:
-Video Video Data: The source video file to be enhanced.
-Integer Width: The target horizontal resolution in pixels.
-Integer Height: The target vertical resolution in pixels.
For 4K Astra, use 3840x2160 (Landscape) or 2160x3840 (Portrait). This establishes a bounding box, and the aspect ratio is preserved.
-Text Enhancement Filter: The specific AI model used to reconstruct textures and remove artifacts. Use the Enhancement Filter Node to structure JSON.
-Text Frame Interpolation: The model used to generate intermediate frames for smoother motion. Use the Interpolation Filter Node to structure JSON.
-Text Astra Overrides: Sets the paid diffusion flag needed for the Astra models slf-1 and slc-1.
Provides Outputs:
-Video Generated Video: The finalized, enhanced, and upscaled video file.
|
|
GenAI/Topaz
|
Topaz Video Enhancement Filter
|
Formats settings into JSON for connection to Topaz Video Enhancement.
Expects Inputs:
-Text Model: Selects the core AI architecture (e.g., ahq-12, proteus, artemis). High-quality models like ahq are best for high-bitrate sources.
-Text Video Type: Defines the source video structure. Progressive is standard for digital/AI video; Interlaced is for older broadcast/tape footage.
-Text Auto: Toggles sub-algorithms between Auto (AI-determined) and Manual (user-defined) for the sliders below.
-Text Field Order: For interlaced video, where each frame is split into two fields of alternating lines, sets which field comes first.
-Text Focus Fix Level: Repairs slightly out-of-focus footage by downscaling to find a sharper base, then upscaling back.
-Text Creativity: Low (default) or high. Only applies to the slc-1 Starlight Creative Astra model.
-Float Compression (-1.0 to 1.0): Removes blocky artifacts from low-bitrate or crunchy source videos.
-Float Details (0.0 to 0.1): Reconstructs micro-textures like skin pores, fabric weaves, or fine foliage.
-Float Pre-noise (0.0 to 0.1): How much original sensor noise to ignore before enhancement begins.
-Float Noise (-1.0 to 1.0): Targets removal or preservation of luminance and chroma noise.
-Float Halo (-1.0 to 1.0): Reduces ringing or white outlines on edges from over-sharpened source material.
-Float Pre-blur (-1.0 to 1.0): Softens harsh edges or pre-processes pixelated footage so the AI can read shapes better.
-Float Blur (-1.0 to 1.0): General blur control.
-Float Grain (0.0 to 1.0): AI grain intensity to prevent an unnaturally smooth look.
-Integer Grain (0 to 5): Grain particle size for filmic vs. digital aesthetic.
-Float Recover Original Detail (0.0 to 1.0): Blends original unprocessed frames back to maintain a natural appearance.
Provides Outputs:
-Text Enhancement Filter: The formatted filter configuration.
-Text Astra Overrides: Additional Astra model overrides.
|
|
GenAI/ComfyUI
|
ComfyUI Workflow Bridge
|
Sends image, video, and text data to a user-provided ComfyUI workflow JSON.
Unlike the standard ComfyUI Bridge, which fetches the workflow from the connected ComfyUI instance, this node accepts a workflow JSON via connection (e.g., from a Load Text Node).
Requires the Nodey Bridge extension installed in ComfyUI.
Expects Inputs:
-Image Image (optional): Image data to send to ComfyUI.
-Video Video (optional): Video data to send to ComfyUI.
-Text Text (optional): Text/prompt data to send to ComfyUI.
-Text Workflow JSON: Connect a Load Text Node with the workflow in API format – exported via Save (API Format) in ComfyUI.
-Text Nodey ID: Unique identifier matching a NodeyInput Node in the workflow.
Provides Outputs:
-Image Result Image: First output image from ComfyUI workflow.
-Video Result Video: First output video/GIF from ComfyUI workflow.
-Text Result Text: Text output from ComfyUI workflow (if any).
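When preparing a workflow file by hand, it can help to confirm that it contains an input node for the Nodey ID you plan to use. The sketch below does this for an API-format workflow, which ComfyUI exports as a JSON object mapping node IDs to entries with class_type and inputs. The class name NodeyInput and the input key nodey_id are assumptions taken from the descriptions in this table, not confirmed names from the Nodey Bridge extension.

```python
import json

def find_nodey_inputs(workflow_json, nodey_id):
    """Return the IDs of workflow nodes whose nodey_id input matches."""
    workflow = json.loads(workflow_json)
    matches = []
    for node_id, node in workflow.items():
        # "NodeyInput" and "nodey_id" are assumed names, used for illustration.
        if node.get("class_type") != "NodeyInput":
            continue
        if node.get("inputs", {}).get("nodey_id") == nodey_id:
            matches.append(node_id)
    return matches

# Hypothetical two-node workflow in API format.
example = json.dumps({
    "3": {"class_type": "KSampler", "inputs": {"seed": 1}},
    "7": {"class_type": "NodeyInput", "inputs": {"nodey_id": "hero_prompt"}},
})
print(find_nodey_inputs(example, "hero_prompt"))  # ['7']
```

An empty result suggests the Nodey ID set on this node would not match anything in the workflow.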
|
|
GenAI/ComfyUI
|
ComfyUI Bridge
|
Sends image, video, and text data to ComfyUI workflows.
Requires the Nodey Bridge extension installed in ComfyUI.
Automatically fetches the current workflow from the connected ComfyUI instance.
Data is mapped to NodeyInput Nodes by matching the nodey_id.
Expects Inputs:
-Image Image (optional): Image data to send to ComfyUI.
-Video Video (optional): Video data to send to ComfyUI.
-Text Text (optional): Text/prompt data to send to ComfyUI.
-Text Nodey ID: Unique identifier matching a NodeyInput Node in ComfyUI.
Provides Outputs:
-Image Result Image: First output image from ComfyUI workflow.
-Video Result Video: First output video/GIF from ComfyUI workflow.
-Text Result Text: Text output from ComfyUI workflow (if any).
|
|
IO
|
Screenshare Output
|
Captures screen content as image frames or video recordings.
Uses the browser's Screen Capture API to record from a selected display, window, or browser tab.
Requires the user to configure screen capture, including source selection (display, window, or tab) and capture settings.
Provides Outputs:
-Image Captured Frame: Single image frame from the screen capture.
-Video Captured Video: Video recording of the screen capture session.
|
|
GenAI/Topaz
|
Topaz Video Frame Interpolation Filter
|
This node creates a structured JSON configuration for temporal enhancements.
Expects Inputs:
-Text Model: Selects the AI architecture for motion estimation (e.g., apo-8, chronos, apollo). apo models are optimized for fast-moving action and high-accuracy slow motion.
-Integer Slowmo (1 to 16): Sets the deceleration factor. A value of 2 doubles the number of frames (half speed), while 16 creates extreme slow motion from standard footage.
-Integer FPS (15 to 240): Sets the target frames per second. The AI will generate exactly enough frames to reach this specific playback speed regardless of the source.
-Text Duplicate Frames: Toggles True/False detection and replacement of repeated frames, common in traditional animation or low-bandwidth web video.
-Float Duplicate Threshold (0.001 to 0.1): Sensitivity for duplicate detection. Higher values make the AI more aggressive in identifying slightly varying frames as duplicates.
Provides Outputs:
-Text Frame Interpolation Filter: The formatted interpolation configuration.
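The Slowmo and FPS settings both come down to frame-count arithmetic, sketched below. The function names are hypothetical, and the exact frame counts Topaz produces at clip boundaries may differ by a frame; this is only the idealized calculation.

```python
def slowmo_frame_count(source_frames, slowmo):
    """Frames after slow motion: a factor of 2 doubles the frame count."""
    if not 1 <= slowmo <= 16:
        raise ValueError("slowmo must be between 1 and 16")
    return source_frames * slowmo

def frames_for_target_fps(duration_secs, target_fps):
    """Frames needed to play duration_secs of footage at target_fps."""
    if not 15 <= target_fps <= 240:
        raise ValueError("target_fps must be between 15 and 240")
    return round(duration_secs * target_fps)

# A 4-second clip shot at 24 fps has 96 frames.
print(slowmo_frame_count(96, 2))       # 192 frames, i.e. 8 seconds at 24 fps
print(frames_for_target_fps(4.0, 60))  # 240 frames for 60 fps playback
```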
|
|
Image
|
Cube Map to Images
|
Converts a standard flat cube map image to the individual image planes.
Expects Inputs:
-Image Cubemap Image: The standard flat cube map image.
Provides Outputs:
-Image Front Image: The front-facing image.
-Image Left Image: The left-facing image.
-Image Back Image: The back-facing image.
-Image Right Image: The right-facing image.
-Image Top Image: The top-facing image.
-Image Bottom Image: The bottom-facing image.
|
|
Image
|
Cube to Panorama
|
Converts a set of 4 cube face images to a cylindrical projected panoramic image.
Expects Inputs:
-Image Front Image: The front-facing image.
-Image Left Image: The left-facing image.
-Image Back Image: The back-facing image.
-Image Right Image: The right-facing image.
Provides Outputs:
-Image Panoramic Image: The panoramic image as a cylindrical projection of the cube faces provided.
|
|
Image
|
Cube to Sphere
|
Converts a set of 6 cube face images to an equirectangular projected spherical image.
Expects Inputs:
-Image Front Image: The front-facing image.
-Image Left Image: The left-facing image.
-Image Back Image: The back-facing image.
-Image Right Image: The right-facing image.
-Image Top Image: The top-facing image.
-Image Bottom Image: The bottom-facing image.
Provides Outputs:
-Image Equirectangular Image: The equirectangular projected spherical image.
|
|
Image
|
Image Atlas
|
Aggregates multiple individual images into a single tiled composite or grid-based reference sheet.
Expects Inputs:
-Image Image 1-9: The source images to be tiled in the grid.
-Text Color: Defines the background or border color between the tiled images.
Provides Outputs:
-Image Image Atlas: The final combined grid image.
|
|
Image
|
Image Blur
|
Applies professional-grade softening effects to an image with support for asymmetric and box-style blurring.
Expects Inputs:
-Image Image: The source image to be blurred.
-Float Blur Radius: The primary intensity of the effect; larger values result in a more out-of-focus image.
-Float Asym Vert Blur (optional): Allows for independent control over vertical blurring to create anamorphic-style streaks or motion-blur effects.
-Boolean Box (not Gaussian): A toggle to switch from a smooth Gaussian curve to a faster, more linear Box blur algorithm, which can produce more stylized, angular edges.
Provides Outputs:
-Image Blurred Image: The blurred image.
|
|
Image
|
Image Color Correction
|
This node provides a streamlined interface to modify the tonal and color characteristics of an image. It is essential for matching AI-generated elements with real-world footage or correcting lighting inconsistencies within a workflow.
Expects Inputs:
-Image Image: The source image to be color corrected.
Provides Outputs:
-Image Edited Image: The color-corrected image.
|
|
Image
|
Image Composite
|
This node is the primary tool for layered image construction. It allows you to place a foreground element (Overlay) onto a base layer (Background) using a mask to precisely control which parts of the overlay are visible.
Expects Inputs:
-Image Background Image: The base layer that serves as the foundation of the composite.
-Image Overlay Image: The foreground element to be placed on top of the background.
-Image Overlay Mask Image: A grayscale image that determines the transparency of the Overlay; white areas are fully visible, while black areas are hidden.
Provides Outputs:
-Image Composited Image: The finalized composite image with the layers successfully merged.
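The mask behavior can be illustrated with a per-pixel blend on single-channel 8-bit values. The source only states that white shows the overlay and black hides it; treating gray mask values as a linear blend is an assumption, and the function names are hypothetical.

```python
def composite_pixel(background, overlay, mask):
    """Blend one 0-255 value: mask 255 shows overlay, mask 0 shows background."""
    alpha = mask / 255.0
    return round(overlay * alpha + background * (1.0 - alpha))

def composite(background, overlay, mask):
    """Blend same-sized 2D grids of channel values, pixel by pixel."""
    return [
        [composite_pixel(b, o, m) for b, o, m in zip(b_row, o_row, m_row)]
        for b_row, o_row, m_row in zip(background, overlay, mask)
    ]

background = [[10, 10], [10, 10]]
overlay = [[200, 200], [200, 200]]
mask = [[255, 0], [0, 255]]
print(composite(background, overlay, mask))  # [[200, 10], [10, 200]]
```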
|
|
Image
|
Image Crop
|
Extracts a specific rectangular sub-section of an image by defining precise pixel offsets and dimensions.
Expects Inputs:
-Image Image: The source image to be cropped.
-Integer Left: The horizontal starting point (X-coordinate) for the crop, measured in pixels from the left edge.
-Integer Top: The vertical starting point (Y-coordinate) for the crop, measured in pixels from the top edge.
-Integer Width: The horizontal length of the final cropped area.
-Integer Height: The vertical length of the final cropped area.
Provides Outputs:
-Image Cropped Image: The cropped image.
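How the four integers select a region can be sketched on a small grid of pixel values. This is an illustration of the offset-and-size arithmetic only, not the node's implementation; the function name is hypothetical.

```python
def crop(pixels, left, top, width, height):
    """Return the width x height region whose top-left corner is (left, top)."""
    return [row[left:left + width] for row in pixels[top:top + height]]

image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
]
print(crop(image, left=1, top=1, width=2, height=2))  # [[5, 6], [9, 10]]
```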
|
|
Image
|
Image Crop w/ UI
|
Extracts a specific rectangular sub-section of an image through a dedicated graphical interface.
Expects Inputs:
-Image Image: The source image to be cropped.
Provides Outputs:
-Image Cropped Image: The cropped image.
|
|
Image
|
Image Details
|
This node acts as an inspector that reads an image file and converts its internal properties into individual data streams.
Expects Inputs:
-Image Image: The source image asset you wish to analyze.
Provides Outputs:
-Integer Width: The horizontal resolution of the image in pixels (e.g., 1400).
-Integer Height: The vertical resolution of the image in pixels (e.g., 1024).
-Integer Channels: The number of color channels (e.g., 3 for standard RGB, 4 if there is an Alpha/Transparency channel).
-Text Format: The file extension or encoding type of the image (e.g., jpeg).
|
|
Image
|
Image FlipFlop
|
Mirrors an image vertically (Flip), horizontally (Flop), or both (FlipFlop).
Expects Inputs:
-Image Image: The source image to transform.
-Text Mode: The mirroring direction — Flip mirrors vertically (top ↔ bottom), Flop mirrors horizontally (left ↔ right), FlipFlop applies both.
Provides Outputs:
-Image Flipped Image: The mirrored image.
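The three modes can be illustrated on a small grid of pixel values. This is a sketch of the mirroring described above, with a hypothetical function name, not the node's actual code.

```python
def flip_flop(pixels, mode):
    """Mirror a 2D pixel grid according to mode: 'Flip', 'Flop', or 'FlipFlop'."""
    if mode == "Flip":
        return pixels[::-1]                         # top <-> bottom
    if mode == "Flop":
        return [row[::-1] for row in pixels]        # left <-> right
    if mode == "FlipFlop":
        return [row[::-1] for row in pixels[::-1]]  # both
    raise ValueError(f"unknown mode: {mode}")

image = [[1, 2], [3, 4]]
print(flip_flop(image, "Flip"))      # [[3, 4], [1, 2]]
print(flip_flop(image, "Flop"))      # [[2, 1], [4, 3]]
print(flip_flop(image, "FlipFlop"))  # [[4, 3], [2, 1]]
```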
|
|
Image
|
Image Grey Convert
|
This node is used to strip color information from an image, converting it into a single-channel grayscale representation. It is essential for creating luminance masks, preparing images for specific AI depth-analysis models, or achieving a classic black-and-white aesthetic.
Expects Inputs:
-Image Image: The source image to be converted.
Provides Outputs:
-Image Converted Image: The grayscale image.
|
|
Image
|
Image Invert
|
This node mathematically flips the color information of an input image (e.g., changing white to black, red to cyan, etc.). It’s a fundamental utility for mask manipulation, allowing you to quickly swap the active and inactive areas of a transparency map.
Expects Inputs:
-Image Image: The source image or mask to be inverted.
-Boolean Also Invert Mask: A toggle that determines if the alpha channel (transparency) should also be flipped along with the RGB color data.
Provides Outputs:
-Image Inverted Image: The finalized inverted image asset.
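For a single pixel, the inversion and the effect of the Also Invert Mask toggle might look like the sketch below. It assumes 8-bit RGBA values (0 to 255); the node's internal bit depth is not stated here, and the function name is hypothetical.

```python
def invert_pixel(rgba, also_invert_mask=False):
    """Invert an 8-bit (r, g, b, a) pixel; alpha flips only when requested."""
    r, g, b, a = rgba
    if also_invert_mask:
        a = 255 - a
    return (255 - r, 255 - g, 255 - b, a)

print(invert_pixel((255, 0, 0, 255)))          # (0, 255, 255, 255), red becomes cyan
print(invert_pixel((255, 255, 255, 0), True))  # (0, 0, 0, 255)
```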
|
|
Image
|
Image Mask
|
This node applies masking data to a source image. It allows you to define which parts of an image should be visible or hidden based on a secondary mask.
Expects Inputs:
-Image Image: The primary source image intended to be masked.
-Image Mask (optional): A secondary image asset (typically grayscale) used to define transparency; white pixels generally represent opacity, while black pixels represent transparency.
Provides Outputs:
-Image Masked Image: The resulting image asset with the mask applied to its alpha channel.
|
|
Image
|
Image Merge
|
Reconstructs a full-color image by combining four independent grayscale channel maps.
Expects Inputs:
-Image Red Channel: A grayscale image used to define the intensity of the red color values.
-Image Green Channel: A grayscale image used to define the intensity of the green color values.
-Image Blue Channel: A grayscale image used to define the intensity of the blue color values.
-Image Alpha Channel: A grayscale image used to define transparency (white is opaque, black is transparent).
Provides Outputs:
-Image Merged Image: The finalized multi-channel color image resulting from the merge.
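Conceptually, the merge pairs up the four grayscale values at each pixel position. A minimal sketch on tiny grids, assuming all four inputs share the same dimensions (the function name is hypothetical):

```python
def merge_channels(red, green, blue, alpha):
    """Combine four same-sized grayscale grids into one grid of RGBA tuples."""
    return [
        list(zip(r_row, g_row, b_row, a_row))
        for r_row, g_row, b_row, a_row in zip(red, green, blue, alpha)
    ]

red = [[255, 0]]
green = [[0, 255]]
blue = [[0, 0]]
alpha = [[255, 128]]
print(merge_channels(red, green, blue, alpha))
# [[(255, 0, 0, 255), (0, 255, 0, 128)]]
```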
|
|
Image
|
Image Pad
|
This node is primarily used to prepare images for outpainting – the process of extending an image beyond its original borders. By adding empty space (padding) around the source, it provides the AI with a workspace to generate new content that blends with the original image.
Expects Inputs:
-Image Image: The source image asset to be padded.
-Integer Left: The number of pixels to add to the left side of the image.
-Integer Top: The number of pixels to add above the image.
-Integer Right: The number of pixels to add to the right side of the image.
-Integer Bottom: The number of pixels to add below the image.
-Text Color (optional): Defines the fill color for the padded areas; if not specified, it typically defaults to black (zero-padding).
Provides Outputs:
-Image Padded Image: The finalized image with the specified padding added to its dimensions.
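The effect of the four padding values on the output dimensions can be sketched on a small grid. The fill value 0 stands in for the default black described above; the function name is hypothetical.

```python
def pad(pixels, left, top, right, bottom, fill=0):
    """Surround a 2D pixel grid with fill-valued borders of the given sizes."""
    width = len(pixels[0]) + left + right
    padded = [[fill] * width for _ in range(top)]
    for row in pixels:
        padded.append([fill] * left + list(row) + [fill] * right)
    padded.extend([fill] * width for _ in range(bottom))
    return padded

image = [[9, 9], [9, 9]]
result = pad(image, left=1, top=1, right=2, bottom=0)
print(len(result[0]), len(result))  # 5 3
```

The output is (left + width + right) pixels wide and (top + height + bottom) pixels tall.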
|
|
Image
|
Image Paint
|
Provides a dedicated manual interface for painting, annotating, and creating multi-layered masks on an image.
Expects Inputs:
-Image Image: The source image to be used as the base canvas for painting.
Provides Outputs:
-Image Edited Image: The primary output that merges the original base image with all painted layers.
-Image Drawing: An isolated output containing only the brush strokes on a transparent or neutral background, excluding the original source image.
-Image Mask: A secondary output that may provide only the isolated paint/brush strokes as a separate asset.
|
|
Image
|
Image Resize
|
This node is a fundamental utility for controlling the physical size of image assets. It’s essential for ensuring images meet the specific input requirements of generation models or for preparing assets for final export. The node allows for both proportional scaling and forced aspect ratio changes.
Expects Inputs:
-Image Image: The source image asset to be resized.
-Integer Width: The target horizontal resolution in pixels.
-Integer Height: The target vertical resolution in pixels.
-Boolean Free Scale: A toggle that determines scaling behavior. When False, the node typically maintains the original aspect ratio (using the dimensions as a fit-within box); when True, it stretches the image to match the exact width and height provided.
Provides Outputs:
-Image Resized Image: The processed image asset at the new specified dimensions.
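The difference between the two Free Scale modes comes down to how the output dimensions are computed. The sketch below assumes a fit-within calculation with rounding to the nearest pixel; the node's exact rounding is not documented here, and the function name is hypothetical.

```python
def target_size(src_w, src_h, width, height, free_scale=False):
    """Return the output (width, height) for a resize request."""
    if free_scale:
        return (width, height)  # stretch to the exact dimensions
    # Fit within the box: use the smaller scale so neither side overflows.
    scale = min(width / src_w, height / src_h)
    return (max(1, round(src_w * scale)), max(1, round(src_h * scale)))

print(target_size(1920, 1080, 960, 960))                   # (960, 540)
print(target_size(1920, 1080, 960, 960, free_scale=True))  # (960, 960)
```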
|
|
Image
|
Image Rotate
|
This node is used to adjust the orientation of an image asset. It’s essential for correcting crooked horizon lines in AI-generated landscapes or orienting character references.
Expects Inputs:
-Image Image: The source image asset to be rotated.
-Float Angle (degrees): The numerical value for the rotation, where positive values typically rotate clockwise and negative values rotate counter-clockwise.
-Text Color (optional): Defines the fill color for the wedges or empty areas created in the corners when an image is rotated at non-orthogonal angles.
Provides Outputs:
-Image Rotated Image: The finalized rotated image asset.
|
|
Image
|
Solid Image
|
This node is used to create base canvases, background layers, or solid-color masks. It is an essential utility for providing a clean background image for the Image Composite Node or for generating a specific color constant to be used in Image Merge operations.
Expects Inputs:
-Integer Width: The horizontal resolution in pixels.
-Integer Height: The vertical resolution in pixels.
-Integer Channels: The number of color channels.
-Text Color: The fill color for the solid image.
Provides Outputs:
-Image Solid Image: The generated solid color image.
|
|
Image
|
Image Split
|
Deconstructs a color image into its fundamental Red, Green, Blue, and Alpha channel components.
Expects Inputs:
-Image Image: The source color image asset you wish to deconstruct.
Provides Outputs:
-Image Red Channel (Top): A grayscale map representing the intensity of red values across the image.
-Image Green Channel (Middle-Top): A grayscale map representing the intensity of green values.
-Image Blue Channel (Middle-Bottom): A grayscale map representing the intensity of blue values.
-Image Alpha Channel (Bottom): A grayscale map representing transparency – fully opaque areas appear white, while transparent areas appear black.
|
|
Image
|
Image 2 Video
|
This node acts as a bridge between image and video processing. It takes a single still frame and stretches it across time to create a video clip.
Expects Inputs:
-Image Image: The source still image intended to be converted.
-Float Duration (secs): The total length of the resulting video file in seconds (e.g., 5 seconds).
-Float Framerate (fps): The playback speed or temporal resolution of the video (e.g., 30 fps).
Provides Outputs:
-Video Video from Image: A video data stream containing the repeated frame sequence.
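The number of repeated frames follows from the duration and framerate. A minimal sketch, assuming the frame count is rounded to the nearest whole frame; the file name is a placeholder and the function name is hypothetical.

```python
def still_to_frames(image, duration_secs, framerate):
    """Repeat one frame enough times to fill duration_secs at framerate."""
    frame_count = round(duration_secs * framerate)
    return [image] * frame_count

frames = still_to_frames("still.png", duration_secs=5.0, framerate=30.0)
print(len(frames))  # 150
```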
|
|
IO
|
Load Audio
|
Loads an audio file.
Expects Inputs:
-Audio Audio File: The audio file to load.
Provides Outputs:
-Audio Audio Data: The loaded audio data.
|
|
IO
|
Save Audio
|
Saves an audio file.
Expects Inputs:
-Audio Audio to Save: The audio data to save.
Provides Outputs:
-Audio Audio Saved: The saved audio file.
|
|
IO
|
Load Image
|
Loads an image file.
Expects Inputs:
-Image Image File: The image file to load.
Provides Outputs:
-Image Image: The loaded image.
|
|
IO
|
Save Image
|
Saves an image file.
Expects Inputs:
-Image Image to Save: The image to save.
Provides Outputs:
-Image Image Saved: The saved image.
|
|
IO
|
Load Image Sequence
|
Imports a series of numbered image files and compiles them into a playable video data stream.
Expects Inputs:
-Video Sequence Data: The image sequence to load.
Provides Outputs:
-Video Video: The compiled video stream.
|
|
IO
|
Save Image Sequence
|
This node acts as an export engine that deconstructs video data back into its constituent frames.
Expects Inputs:
-Video Video Input: The video to export as frames.
Provides Outputs:
-Video Video Output: The processed video data.
|
|
IO
|
Load 3D Model
|
Loads a 3D model file.
Expects Inputs:
-Model-3D Model File: The 3D model file to load.
Provides Outputs:
-Model-3D Model: The loaded 3D model.
|
|
IO
|
Save 3D Model
|
Saves a 3D model file.
Expects Inputs:
-Model-3D 3D Model to Save: The 3D model to save.
Provides Outputs:
-Model-3D 3D Model Saved: The saved 3D model.
|
|
IO
|
Load JSON
|
Loads a JSON file and outputs the parsed JSON object.
Expects Inputs:
-Text JSON File: The JSON file to load.
Provides Outputs:
-json JSON Data: The parsed JSON object.
|
|
IO
|
Save JSON
|
Saves a JSON object to a file.
Expects Inputs:
-json JSON to Save: The JSON object to save.
Provides Outputs:
-json JSON Saved: The saved JSON data.
|
|
IO
|
Load Splat Model
|
Loads a Gaussian splat model file.
Expects Inputs:
-Model-splat Splat File: The splat model file to load.
Provides Outputs:
-Model-splat Splat Model: The loaded splat model.
|
|
IO
|
Save Splat Model
|
Saves a Gaussian splat model file.
Expects Inputs:
-Model-splat Splat Model to Save: The splat model to save.
Provides Outputs:
-Model-splat Splat Model Saved: The saved splat model.
|
|
IO
|
Load Text
|
Loads a text file.
Expects Inputs:
-Text Text File: The text file to load.
Provides Outputs:
-Text File Content: The loaded text content.
|
|
IO
|
Save Text
|
Saves a text file.
Expects Inputs:
-Text Text to Save: The text to save.
Provides Outputs:
-Text Text Saved: The saved text.
|
|
IO
|
Load Video
|
Loads a video file.
Expects Inputs:
-Video Video File: The video file to load.
Provides Outputs:
-Video Video: The loaded video.
|
|
IO
|
Load Audio URL
|
Loads audio from a remote URL.
Expects Inputs:
-Text URL: The remote URL of the audio to load.
Provides Outputs:
-Audio Audio Data: The loaded audio data.
|
|
IO
|
Load Image URL
|
Loads an image from a remote URL.
Expects Inputs:
-Text URL: The remote URL of the image to load.
Provides Outputs:
-Image Image Data: The loaded image.
|
|
IO
|
Load 3D Model URL
|
Loads a 3D model from a remote URL.
Expects Inputs:
-Text URL: The remote URL of the 3D model to load.
Provides Outputs:
-Model-3D 3D Model: The loaded 3D model.
|
|
IO
|
Load Splat URL
|
Loads a splat model from a remote URL.
Expects Inputs:
-Text URL: The remote URL of the splat model to load.
Provides Outputs:
-Model-splat Splat Model: The loaded splat model.
|
|
IO
|
Load Video URL
|
Loads a video from a remote URL.
Expects Inputs:
-Text URL: The remote URL of the video to load.
Provides Outputs:
-Video Video Data: The loaded video.
|
|
IO
|
Save Video
|
Saves a video file.
Expects Inputs:
-Video Video to Save: The video to save.
Provides Outputs:
-Video Video Saved: The saved video.
|
|
IO
|
Video LiLo
|
Video LiLo (Lots In, Lots Out) runs multiple parallel copies of the upstream video graph and collects results into separate output slots. Each copy executes independently, enabling batch video generation from a single pipeline.
Expects Inputs:
-Video Video to Save: The video pipeline to multiply.
-String Count: Number of parallel executions (1-16).
Provides Outputs:
-Video Video 1-16: Individual video outputs from each parallel execution.
|
|
IO
|
Image LiLo
|
Image LiLo (Lots In, Lots Out) runs multiple parallel copies of the upstream image graph and collects results into separate output slots. Each copy executes independently, enabling batch image generation from a single pipeline.
Expects Inputs:
-Image Image to Save: The image pipeline to multiply.
-String Count: Number of parallel executions (1-16).
Provides Outputs:
-Image Image 1-16: Individual image outputs from each parallel execution.
|
|
JSON
|
JSON -> Text
|
Converts a JSON object to formatted text.
Expects Inputs:
-json JSON Data: The JSON object to convert.
Provides Outputs:
-Text JSON Text: The formatted text.
|
|
JSON
|
JSON Get Key
|
Extracts a value from a JSON object using a dot-notation key path (e.g. data.users.0.name). Numeric path segments are treated as array indices. The result is always returned as text.
Expects Inputs:
-json JSON Data: The JSON object to extract from.
-Text Key Path: Dot-notation path to the value (e.g. data.users.0.name).
Provides Outputs:
-Text Value: The extracted value as text.
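The path lookup can be sketched as a walk over the dot-separated segments. How non-text values are converted to text is an assumption here (JSON encoding), and get_key is a hypothetical name, not the node's API.

```python
import json

def get_key(data, key_path):
    """Follow a dot-notation path through nested dicts and lists."""
    current = data
    for segment in key_path.split("."):
        if isinstance(current, list):
            current = current[int(segment)]  # numeric segment is an array index
        else:
            current = current[segment]
    # Assumption: strings pass through unchanged; other values are JSON-encoded.
    return current if isinstance(current, str) else json.dumps(current)

doc = {"data": {"count": 2, "users": [{"name": "Ada"}, {"name": "Lin"}]}}
print(get_key(doc, "data.users.0.name"))  # Ada
print(get_key(doc, "data.count"))         # 2
```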
|
|
JSON
|
Text -> JSON
|
Parses a JSON-formatted text into a JSON object.
Expects Inputs:
-Text JSON Text: The JSON-formatted text to parse.
Provides Outputs:
-json Parsed JSON: The parsed JSON object.
|
|
Math
|
Add
|
Adds two numbers.
Expects Inputs:
-Float a: First number.
-Float b: Second number.
Provides Outputs:
-Float sum: The sum of a and b.
|
|
Math
|
Divide
|
Divides two numbers.
Expects Inputs:
-Float a: Dividend.
-Float b: Divisor.
Provides Outputs:
-Float quotient: The result of a divided by b.
|
|
Math
|
Expression
|
Evaluates a mathematical expression.
Expects Inputs:
-Text Expression: The mathematical expression to evaluate.
Provides Outputs:
-Float Result (float): The result as a floating point number.
-Integer Result (integer): The result as an integer.
-Boolean Result (boolean): The result as a boolean.
|
|
Math
|
Float -> Int
|
Converts a float to an integer.
Expects Inputs:
-Float f: The float value to convert.
Provides Outputs:
-Integer int: The integer result.
|
|
Math
|
Int -> Float
|
Converts an integer to a float.
Expects Inputs:
-Integer i: The integer value to convert.
Provides Outputs:
-Float float: The float result.
|
|
Math
|
Multiply
|
Multiplies two numbers.
Expects Inputs:
-Float a: First number.
-Float b: Second number.
Provides Outputs:
-Float product: The product of a and b.
|
|
Math
|
Power
|
Raises a number to a power.
Expects Inputs:
-Float a: The base number.
-Float b: The exponent.
Provides Outputs:
-Float power: The result of a raised to the power of b.
|
|
Math
|
Subtract
|
Subtracts two numbers.
Expects Inputs:
-Float a: First number.
-Float b: Second number.
Provides Outputs:
-Float difference: The result of a minus b.
|
|
Model
|
GLB Apply Textures
|
Appends external PBR textures to a GLB's binary payload and updates the first material definition to reference them.
Expects Inputs:
-Image Base Color: The base color material.
-Image Metallic Roughness: The metallic roughness component.
-Image Normal Map: The normal map.
-Image Occlusion Map: The occlusion map.
-Image Emissive Map: The emissive map.
Provides Outputs:
-Model Textured Model: The model with the materials and textures applied.
|
|
Model
|
GLB Extract Textures
|
Extracts standard PBR textures from the given material index of a GLB file.
Expects Inputs:
-Model Model: The input GLB model with materials and textures.
-Integer Material Index: Optional material index to extract.
Provides Outputs:
-Image Base Color: The base color material.
-Image Metallic Roughness: The metallic roughness component.
-Image Normal Map: The normal map.
-Image Occlusion Map: The occlusion map.
-Image Emissive Map: The emissive map.
|
|
Model
|
GLB Strip Textures
|
Removes all material, texture, and image references from a GLB 3D model.
Expects Inputs:
-Model Model: The input GLB model with materials and textures to be removed.
Provides Outputs:
-Model Model: The output GLB model with the materials and textures removed.
|
|
ShotGrid
|
Load Shotgrid Published Image
|
Loads a published file image from the currently active ShotGrid project.
Expects Inputs:
-Text Published Metadata: Publish metadata retrieved from ShotGrid.
Provides Outputs:
-Image Image: Image from the published metadata.
|
|
ShotGrid
|
Load Shotgrid Published Video
|
Loads a published file video from the currently active ShotGrid project.
Expects Inputs:
-Text Published Metadata: Publish metadata retrieved from ShotGrid.
Provides Outputs:
-Video Video: Video from the published metadata.
|
|
ShotGrid
|
Shotgrid Published Files
|
Provides the ability to specify filters for Shot, Task, Name Search, Pipeline Step, and Published File Type to generate a selection list of ShotGrid published files.
Provides Outputs:
|
|
Text
|
Int -> Text
|
Converts an integer to text.
Expects Inputs:
-Integer I: The integer to convert.
Provides Outputs:
-Text Text: The text representation.
|
|
Text
|
Text -> Int
|
Converts text to an integer.
Expects Inputs:
-Text Text: The text to convert.
Provides Outputs:
-Integer Int: The integer result.
|
|
Text
|
Float -> Text
|
Converts a float to text.
Expects Inputs:
-Float F: The float to convert.
Provides Outputs:
-Text Text: The text representation.
|
|
Text
|
Text -> Float
|
Converts text to a float.
Expects Inputs:
-Text Text: The text to convert.
Provides Outputs:
-Float Float: The float result.
|
|
Text
|
Text Concatenate
|
Concatenates two texts.
Expects Inputs:
-Text Text 1: The first text.
-Text Text 2: The second text.
Provides Outputs:
-Text Concatenated Text: The combined text.
|
|
Text
|
Text Portion
|
Extracts a substring from a text.
Expects Inputs:
-Text Text: The source text.
-Integer 1st Character (from 1): The starting character position.
-Integer Number of Characters: The number of characters to extract.
Provides Outputs:
-Text Portion: The extracted substring.
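Because the starting position counts from 1 rather than 0, an off-by-one adjustment is needed when mapping it onto zero-based slicing. A minimal sketch with a hypothetical function name; rejecting positions below 1 is an assumption about how invalid input is handled.

```python
def text_portion(text, first_character, number_of_characters):
    """Return number_of_characters starting at a 1-based position."""
    if first_character < 1:
        # Assumption: positions below 1 are treated as invalid.
        raise ValueError("first_character counts from 1")
    start = first_character - 1  # convert from 1-based to 0-based
    return text[start:start + number_of_characters]

print(text_portion("Hello, world", 8, 5))  # world
```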
|
|
Video
|
Video Audio Mix
|
Synchronizes independent audio and video streams into a unified media asset with precise timing and volume control.
Expects Inputs:
-Video Video: The primary visual data stream.
-Audio Audio: The sound file or audio stream to be combined with the video.
-Float Mix (-1.0 to 1.0): Adjusts the output volume balance. 0 represents the original volume, while negative values attenuate and positive values boost the signal.
-Float Audio Delay (secs): Offsets the audio start time relative to the video. Positive values delay the audio, while negative values make the audio start earlier to fix sync drift.
Provides Outputs:
-Video Video With Audio: A finalized media container containing both the visual and auditory tracks synced together.
|
|
Video
|
Video Audio Split
|
Splits the audio track out of a video.
Expects Inputs:
-Video Video: The video with the audio track to be extracted.
-Boolean Strip Audio from Video: If true, removes the audio track from the returned video.
Provides Outputs:
-Video Video: The original video, optionally with the audio removed.
-Audio Audio: The extracted audio track (if any).
|
|
Video
|
Video Color Correction
|
This node provides a streamlined interface to modify the tonal and color characteristics of a video. It is essential for matching AI-generated elements with real-world footage or correcting lighting inconsistencies within a workflow.
Expects Inputs:
-Video Video: The source video to be color corrected.
Provides Outputs:
-Video Edited Video: The color-corrected video.
|
|
Video
|
Video Composite
|
This node layers a foreground video (Overlay) onto a base video (Background), using a third video stream (Mask) to define visibility.
Expects Inputs:
-Video Background Video: The base video layer that provides the foundation for the composition.
-Video Overlay Video: The foreground video layer to be placed on top of the background.
-Video Overlay Mask Video: A grayscale video stream that determines the transparency of the Overlay; white areas are visible, and black areas are hidden.
-Float Overlay Delay (secs): Offsets the start time of the Overlay and Mask videos relative to the Background. A positive value waits before starting the overlay, while a negative value starts it earlier.
Provides Outputs:
-Video Composited Video: The finalized composite video stream with layers merged.
|
|
Video
|
Video Concat
|
This node acts as a basic non-linear editor within the node graph. It appends Video 2 directly to the end of Video 1, allowing for the creation of multi-shot sequences or the stitching together of AI-generated clips without needing an external video editor.
Expects Inputs:
-Video Video 1: The primary video clip that will appear first in the sequence.
-Video Video 2: The second video clip that will be appended to the first.
-Boolean Trim Joining Frame: A toggle that, when enabled, removes the overlapping or redundant frame at the exact point where the two videos meet.
Provides Outputs:
-Video Concatenated Video: A single video data stream containing the combined sequence of both input clips.
|
|
Video
|
Video Crop
|
Extracts a specific rectangular sub-section of a video by defining precise pixel offsets and dimensions.
Expects Inputs:
-Video Video: The source video to be cropped.
-Integer Left: The horizontal starting point (X-coordinate) for the crop, measured in pixels from the left edge.
-Integer Top: The vertical starting point (Y-coordinate) for the crop, measured in pixels from the top edge.
-Integer Width: The horizontal length of the final cropped area.
-Integer Height: The vertical length of the final cropped area.
Provides Outputs:
-Video Cropped Video: The cropped video.
|
|
Video
|
Video Crop w/ UI
|
Extracts a specific rectangular sub-section of a video through a dedicated graphical interface.
Expects Inputs:
-Video Video: The source video to be cropped.
Provides Outputs:
-Video Cropped Video: The cropped video.
|
|
Video
|
Video Details
|
This node acts as an inspector for video files, deconstructing a video stream into its fundamental technical specifications.
Expects Inputs:
-Video Video: The source video asset you wish to analyze.
Provides Outputs:
-Integer Width: The horizontal resolution of the video in pixels (e.g., 1920).
-Integer Height: The vertical resolution of the video in pixels (e.g., 1080).
-Float FPS: The temporal resolution or playback speed in frames per second (e.g., 29.97).
-Float Duration: The total length of the video file in seconds (e.g., 106.44).
-Integer Audio Channels: The number of audio channels detected (e.g., 2 for stereo).
|
|
Video
|
Video Edit
|
This node serves as a "Human-in-the-Loop" editor within a node-based workflow. It allows users to manually define the start and end points of a clip, rearrange sequences, or select specific segments for AI processing. This is essential for isolating a particular action within a long video before sending it to more resource-heavy nodes like Topaz Video Enhance Resolution.
Expects Inputs:
-Video Video: The source video to be edited.
Provides Outputs:
-Video Edited Video: The edited video.
|
|
Video
|
Video FlipFlop
|
Mirrors a video vertically (Flip), horizontally (Flop), or both (FlipFlop).
Expects Inputs:
-Video Video: The source video to transform.
-Text Mode: The mirroring direction. Flip mirrors vertically (top ↔ bottom), Flop mirrors horizontally (left ↔ right), and FlipFlop applies both.
Provides Outputs:
-Video Flipped Video: The mirrored video.
|
|
Video
|
Video Frame
|
This node acts as a "frame-grabber," allowing you to isolate one moment in time from a video clip.
Expects Inputs:
-Video Video: The source video stream from which the frame will be pulled.
-Integer Frame Number: The exact index of the frame to be extracted (e.g., frame 0 for the very first frame).
Provides Outputs:
-Image Frame Image: The extracted still frame as a standard image asset.
|
|
Video
|
Video Grey Convert
|
This node is used to strip color information from a video, converting it into a single-channel grayscale representation. It’s essential for creating luminance masks, preparing videos for specific AI depth-analysis models, or achieving a classic black-and-white aesthetic.
Expects Inputs:
-Video Video: The source video to convert.
Provides Outputs:
-Video Converted Video: The grayscale video.
|
|
Video
|
Video Mask
|
This node is a hybrid compositing tool that layers a foreground video onto a background video based on a single, non-moving image mask. It is ideal for picture-in-picture effects, static logo overlays, or framing a video within a specific static shape (like a circle or border) throughout its entire duration.
Expects Inputs:
-Video Background Video: The primary video stream that serves as the bottom layer.
-Video Overlay Video: The secondary video stream to be placed on top.
-Image Overlay Mask Image: A static grayscale image where white pixels reveal the overlay and black pixels hide it.
-Float Overlay Delay (secs): Offsets the start time of the overlay video relative to the background.
Provides Outputs:
-Video Masked Video: The finalized video stream with the masked overlay applied.
|
|
Video
|
Video Pad
|
This node is primarily used to prepare videos for outpainting, the process of extending a frame beyond its original borders. By adding empty space (padding) around the source, it gives the AI a workspace in which to generate new content that blends with the original footage.
Expects Inputs:
-Video Video: The source video asset to be padded.
-Integer Left: The number of pixels to add to the left side of the video.
-Integer Top: The number of pixels to add above the video.
-Integer Right: The number of pixels to add to the right side of the video.
-Integer Bottom: The number of pixels to add below the video.
-Text Color (optional): Defines the fill color for the padded areas; if not specified, it typically defaults to black (zero-padding).
Provides Outputs:
-Video Padded Video: The finalized video with the specified padding added to its dimensions.
|
|
Video
|
Video Resize
|
This node is a fundamental utility for controlling the physical size of video assets. It’s essential for ensuring videos meet the specific input requirements of generation models or for preparing videos for final export. The node allows for both proportional scaling and forced aspect ratio changes.
Expects Inputs:
-Video Video: The source video asset to be resized.
-Integer Width: The target horizontal resolution in pixels.
-Integer Height: The target vertical resolution in pixels.
-Boolean Free Scale: A toggle that determines scaling behavior. When False, the node typically maintains the original aspect ratio (using the dimensions as a "fit-within" box); when True, it stretches the video to match the exact width and height provided.
Provides Outputs:
-Video Resized Video: The processed video asset at the new specified dimensions.
|
|
Video
|
Video Reverse
|
This node mathematically reorders the frames of an input video so that the last frame becomes the first and the first frame becomes the last. It’s a creative utility used for boomerang-style loops, corrective temporal adjustments, or achieving specific visual storytelling effects in a video pipeline.
Expects Inputs:
-Video Video: The source video to reverse.
Provides Outputs:
-Video Reversed Video: The reversed video.
|
|
Video
|
Video Rotate
|
This node is used to adjust the orientation of a video asset. It’s useful for correcting crooked horizon lines in AI-generated landscapes or orienting character references.
Expects Inputs:
-Video Video: The source video asset to be rotated.
-Float Angle (degrees): The numerical value for the rotation, where positive values typically rotate clockwise and negative values rotate counter-clockwise.
-Text Color (optional): Defines the fill color for the wedges or empty areas created in the corners when a video is rotated at non-orthogonal angles.
Provides Outputs:
-Video Rotated Video: The finalized rotated video asset.
|
|
Video
|
Video Split
|
Deconstructs a color video into its fundamental Red, Green, Blue, and Audio channel components.
Expects Inputs:
-Video Input Video: The source color video asset you wish to deconstruct.
Provides Outputs:
-Video Red Channel (Top): A grayscale map representing the intensity of red values across the image.
-Video Green Channel (Middle-Top): A grayscale map representing the intensity of green values.
-Video Blue Channel (Middle-Bottom): A grayscale map representing the intensity of blue values.
-Audio Audio: The independent audio data stream extracted from the video container.
|
|
Video
|
Video Trim
|
This node is used to isolate a specific portion of a long video file without needing to launch the full Video Editor interface.
Expects Inputs:
-Video Video: The source video stream to be trimmed.
-Integer Start (frame): The exact frame index where the new clip should begin.
-Float Duration (secs): The total length of the resulting clip in seconds.
Provides Outputs:
-Video Trimmed Video: A new video data stream containing only the specified temporal segment.
|
|
Root
|
Float
|
A floating-point number constant.
Expects Inputs:
-Float Value: The float value.
Provides Outputs:
-Float Value: The float value.
|
|
Root
|
Integer
|
An integer constant.
Expects Inputs:
-Integer Value: The integer value.
Provides Outputs:
-Integer Value: The integer value.
|
|
Root
|
Text
|
A text constant.
Expects Inputs:
-Text text_data: The text value.
Provides Outputs:
-Text text_data: The text value.
|
|
Root
|
Boolean
|
A boolean constant.
Expects Inputs:
-Text Value: True or False.
Provides Outputs:
-Boolean Value: The boolean value.
|
|
GenAI/3D
|
Meshy Text to 3D
|
The Meshy Text to 3D Node generates a 3D model from a text description using AI. Unlike the image-based Meshy node, this variant relies entirely on a written prompt to define the object's shape and appearance.
Expects Inputs:
-Text Prompt: A text description of the 3D object to generate (e.g., ‘a medieval wooden shield’).
-Boolean Create Humanoid: A toggle that optimizes the generation pipeline for bipedal character models when enabled.
-Integer Max Triangles: A cap on the mesh density (polygon count) to optimize performance.
-Text Texture Prompt: A text description guiding the look, material, and style of the surface.
-Image Texture Image: A reference image to guide specific patterns or material colors.
Provides Outputs:
-Model-3D Generated Model: The resulting 3D model file, typically including vertex data and mapped textures.
|
|
GenAI/3D
|
Meshy Textureizer
|
The Meshy Textureizer Node applies AI-generated textures to an existing 3D model. It takes a bare or previously textured mesh and re-skins it based on a text prompt and optional reference image, allowing rapid iteration on surface appearance without regenerating the geometry.
Expects Inputs:
-Model-3D Model: The source 3D mesh to be textured.
-Text Texture Prompt: A text description guiding the desired material, color, and surface style.
-Image Texture Image: A visual reference to guide the patterns or colors of the texture map.
-Boolean Ignore Original UVs: When enabled, the AI discards the model's existing UV layout and generates a new one optimized for the new texture.
Provides Outputs:
-Model-3D Textured Model: The 3D model with newly generated texture maps applied.
|
|
GenAI/3D
|
Meshy Rigger
|
The Meshy Rigger Node automatically generates a skeletal rig for a 3D model using AI, making it ready for animation. It analyzes the mesh geometry to place bones and joint hierarchies appropriate for the model's shape.
Expects Inputs:
-Model-3D Model: The source 3D mesh to be rigged.
-Float Height (meters): The target real-world height of the model in meters, used to correctly scale the skeleton.
-Boolean Animate: When enabled, applies a default animation cycle to the rigged model for immediate preview.
Provides Outputs:
-Model-3D Rigged Model: The 3D model with an embedded skeletal rig ready for animation.
|
|
GenAI/3D
|
Meshy Retopologizer
|
The Meshy Retopologizer Node rebuilds the polygon topology of a 3D model using AI. It replaces dense, irregular meshes (common in AI-generated or sculpted models) with cleaner, more efficient geometry suitable for animation, real-time rendering, or further production work.
Expects Inputs:
-Model-3D Model: The source 3D mesh to be retopologized.
-Boolean Generate Quads: When enabled, the output mesh uses quad-dominant topology instead of triangles, which is preferred for subdivision and animation workflows.
-Integer Max Polygons: A cap on the total polygon count for the retopologized mesh.
-Float Height (meters): The target real-world height of the model in meters, used for correct scaling.
-Boolean Set Origin to Bottom: When enabled, repositions the model's origin point to the bottom center of the bounding box, which is standard for placing characters on ground planes.
Provides Outputs:
-Model-3D Retopologized Model: The rebuilt 3D model with optimized topology.
|
|
ShotGrid
|
ShotGrid Publish
|
The ShotGrid Publish Node uploads and registers an asset file to the connected ShotGrid project. It creates two published file entities in ShotGrid, one for the asset and one for the graph from which the publish was made. Both are provided as outputs for downstream pipeline consumers and review workflows.
Requires:
Expects Inputs:
-Text ShotGrid Context: A JSON text containing the names and IDs of the selected Project, Shot, and Task (e.g., from the ShotGrid Context node).
Provides Outputs:
-Text Last Published Asset: A JSON text representation of the asset PublishedFile entity created by this node, including its ID, name, and path.
-Text Last Published Graph: A JSON text representation of the graph PublishedFile entity created by this node, including its ID, name, and path.
|
|
Image
|
Image Composite w/ UI
|
The Image Composite w/ UI Node provides an interactive graphical interface for layering a foreground element (Overlay) onto a base layer (Background). Unlike the standard Image Composite Node, which requires a separate mask input, this variant includes a built-in visual tool for positioning, scaling, and adjusting the overlay directly. It supports traditional blend modes, including Normal, Multiply, Screen, Overlay, Darken, Lighten, Difference, Exclusion, Hard Light, Soft Light, Color Dodge, Color Burn, Hue, Saturation, Color, and Luminosity.
Expects Inputs:
-Image Background Image: The base layer that serves as the foundation of the composite.
-Image Overlay Image: The foreground element to be placed on top of the background.
-Composite Tool: An interactive UI control for visually adjusting the overlay's position, scale, and blending mode within the composite.
Provides Outputs:
-Image Composited Image: The finalized composite image with the layers merged according to the UI settings.
|
|
OKO
|
OKO Space Selection
|
Allows the user to select an OKO space from the available options.
Provides Outputs:
-Text Space ID: The OKO Space ID.
|
|
OKO
|
OKO Publish Splat
|
Allows the user to publish a splat asset to an OKO space library.
Expects Inputs:
-Text Space ID: The OKO space to save to.
-Text Asset Name: The name of the asset.
-Image Asset Thumbnail: The thumbnail image for the asset.
-Model-splat Asset Splat Model: The splat model file for the asset.
|
|
OKO
|
OKO Publish Model
|
Allows the user to publish a model asset to an OKO space library.
Expects Inputs:
-Text Space ID: The OKO space to save to.
-Text Asset Name: The name of the asset.
-Model-3D Asset Model: The 3D model file for the asset.
|
|
OKO
|
OKO Publish Audio
|
Allows the user to publish an audio asset to an OKO space library.
Expects Inputs:
-Text Space ID: The OKO space to save to.
-Text Asset Name: The name of the asset.
-Audio Asset Audio: The audio file for the asset.
|
|
OKO
|
OKO Publish Video
|
Allows the user to publish a video asset to an OKO space library.
Expects Inputs:
-Text Space ID: The OKO space to save to.
-Text Asset Name: The name of the asset.
-Image Asset Thumbnail: The thumbnail image for the asset.
-Video Asset Video: The video file for the asset.
|
|
OKO
|
OKO Publish Image
|
Allows the user to publish an image asset to an OKO space library.
Expects Inputs:
-Text Space ID: The OKO space to save to.
-Text Asset Name: The name of the asset.
-Image Asset Image: The image file for the asset.
|
|
OKO
|
OKO Asset Selection
|
Allows the user to select an OKO asset from the available options.
Expects Inputs:
-Text Space ID: The OKO Space ID.
Provides Outputs:
-Text Asset URL: The OKO Asset URL.
|
|
GenAI/Fal
|
Happy Horse Image-To-Video
|
Generates video from a still image using Alibaba's Happy Horse 1.0 model. The input image is used as the first frame, with optional text guidance. Supports up to 1080p resolution and 3-15 seconds of video with synchronized native audio.
Expects Inputs:
-Image Image: The source image to animate (min 300px, aspect ratio 1:2.5 to 2.5:1, max 10 MB).
-Text Prompt: Optional text guidance for the animation (max 2500 characters).
-Integer Seed: Seed for reproducibility.
-Text Resolution: Output video resolution – 720p or 1080p.
-Text Duration: Output video duration in seconds (3-15).
Provides Outputs:
-Video Generated Video: The AI-generated video.
|
|
GenAI/Fal
|
Seedance V1.5 Pro Image-To-Video
|
Generates high-quality videos from text descriptions and images using the Seedance 1.5 Pro model.
Expects Inputs:
-Text Prompt: The text description detailing the subject, style, lighting, and composition of the desired video.
-Image Image: The image to use to drive the video creation.
-Integer Seed: A seed value to guide the determinism of the generation.
-Text Resolution: The resolution of video to generate, any of 480p, 720p, or 1080p.
-Text Duration: The duration of the video to generate, any of 5 or 10 seconds.
Provides Outputs:
-Video Output: The final AI-generated video file.
|
|
GenAI/Fal
|
|
Generates video from multi-modal references using ByteDance's Seedance 2.0 model. Supports up to 9 images, 3 videos, and 3 audio clips as reference inputs, with no more than 12 total reference items. Reference them in the prompt as @Image1, @Video1, @Audio1, etc.
Expects Inputs:
-Text Prompt: Text description for the video. Reference media using @Image1, @Video1, @Audio1, etc.
-Image Image 1–9: Up to 9 reference images (max 30 MB each, JPEG/PNG/WebP).
-Video Video 1–3: Up to 3 reference videos (combined max 50 MB, 2–15s total duration).
-Audio Audio 1–3: Up to 3 reference audio clips (max 15 MB each, combined max 15s). Requires at least one image or video.
-Integer Seed: Seed for reproducibility.
-Text Generate Audio: Whether to generate synchronized audio.
-Text Aspect Ratio: Output aspect ratio (auto, 21:9, 16:9, 4:3, 1:1, 3:4, 9:16).
-Text Resolution: Output resolution (480p, 720p, 1080p).
-Text Duration: Video length in seconds (auto, or 4–15).
Provides Outputs:
-Video Generated Video: The AI-generated video.
|
|
GenAI/Fal
|
Seedance V1.5 Pro Text-To-Video
|
Generates high-quality videos from text descriptions using the Seedance 1.5 Pro model.
Expects Inputs:
-Text Prompt: The text description detailing the subject, style, lighting, and composition of the desired video.
-Integer Seed: A seed value to guide the determinism of the generation.
-Text Aspect Ratio: The aspect ratio of the video to generate.
-Text Resolution: The resolution of video to generate, any of 480p, 720p, or 1080p.
-Text Duration: The duration of the video to generate, any of 5 or 10 seconds.
Provides Outputs:
-Video Output: The final AI-generated video file.
|
|
GenAI/Fal
|
Seedance V2.0 Image-To-Video
|
Generates high-quality videos from text descriptions and images using the Seedance 2.0 model.
Expects Inputs:
-Text Prompt: The text description detailing the subject, style, lighting, and composition of the desired video.
-Image Image: The image to use to drive the video creation.
-Integer Seed: A seed value to guide the determinism of the generation.
-Text Resolution: The resolution of video to generate, any of 480p, 720p, or 1080p.
-Text Duration: The duration of the video to generate, any of 5 or 10 seconds.
Provides Outputs:
-Video Output: The final AI-generated video file.
|
|
GenAI/Fal
|
Seedance V2.0 Text-To-Video
|
Generates high-quality videos from text descriptions using the Seedance 2.0 model.
Expects Inputs:
-Text Prompt: The text description detailing the subject, style, lighting, and composition of the desired video.
-Integer Seed: A seed value to guide the determinism of the generation.
-Text Aspect Ratio: The aspect ratio of the video to generate.
-Text Resolution: The resolution of video to generate, any of 480p, 720p, or 1080p.
-Text Duration: The duration of the video to generate, any of 5 or 10 seconds.
Provides Outputs:
-Video Output: The final AI-generated video file.
|
|
GenAI/Fal
|
Kling O3 Pro Video Edit
|
Edit videos using Kling O3.
Expects Inputs:
-Text Prompt: Prompt text. Refer to the video as @Video1, to reference images as @Image1, @Image2, etc., and to elements as @Element1.
-Video Video: Reference video.
-Image Front Image: Image to use as the front image.
-Image Alternate Image: An alternate view of the image.
-Image Reference 1: Reference image for style/appearance.
-Image Reference 2: Reference image for style/appearance.
-Text Keep Audio: Keep or discard original sound.
Provides Outputs:
-Video Generated Video: The edited video.
|
|
GenAI/Fal
|
Kling O3 Pro Video Reference
|
Kling O3 generates new shots guided by the input reference video.
Expects Inputs:
-Text Prompt: Prompt text. Refer to the video as @Video1, to reference images as @Image1, @Image2, etc., and to elements as @Element1.
-Video Video: Reference video.
-Image Front Image: Image to use as the front image.
-Image Alternate Image: An alternate view of the image.
-Image Reference 1: Reference image for style/appearance.
-Image Reference 2: Reference image for style/appearance.
-Text Keep Audio: Keep or discard original sound.
-Text Aspect Ratio: Any of auto, 16:9, 9:16, 1:1, where auto infers the aspect ratio from the input video.
-Text Duration: The number of seconds to generate, any whole number from 3 to 15.
Provides Outputs:
-Video Generated Video: The generated video.
|
|
GenAI/Fal
|
Kling V3 Video Lip-sync
|
Lip-sync a video to an audio track.
Expects Inputs:
-Video Video: Reference video.
-Audio Audio: Audio to sync with the video.
-Text Sync Mode: Cut Off, Loop, Bounce, Silence, or Remap.
Provides Outputs:
-Video Generated Video: The resultant generated lip-synced video.
|
|
GenAI/Fal
|
Kling V3 Pro Video Motion Control
|
Transfer movements from a reference video to any character image.
Expects Inputs:
-Text Prompt: Prompt text.
-Video Video: Reference video. The character actions will be consistent with this reference video.
-Image Image: Reference image. Characters and backgrounds are based on this image.
-Text Keep Original Sound: Whether to keep original sound (default Keep Sound).
-Text Character Source: Choose Character from Image when the image should define the character, and the result should better follow camera movement (max 10s). Choose Character from Video when the video should define the character, and it’s better suited for complex motions (max 30s).
Provides Outputs:
-Video Generated Video: The resultant generated motion controlled video.
|
|
GenAI/Fal
|
LTX 2.3 Audio to Video
|
Generates a video synchronized to an input audio clip using the LTX 2.3 model. Audio duration must be 2-20 seconds.
Expects Inputs:
-Audio Audio: The audio clip to generate a video from (2-20 seconds).
-Text Prompt: Text description of how the video should look. Required if no start image is provided.
-Image Start Image (Optional): An image to use as the first frame of the video.
-Float Guidance Scale: Controls how closely the output follows the prompt. Defaults to 5 for text, 9 with an image.
-Text Aspect Ratio: The aspect ratio of the generated video (auto, 16:9, or 9:16).
Provides Outputs:
-Video Generated Video: The final AI-generated video file synchronized to the audio.
|
|
GenAI/Fal
|
LTX 2.3 HDR
|
Converts SDR video to HDR using a self-hosted LTX 2.3 IC-LoRA container. Produces a lossless 16-bit H.265 MP4 in ACEScg colour space and/or a tonemapped MP4 preview.
Expects Inputs:
-Video Video: The SDR video to convert to HDR.
-Text Host: Hostname or IP of the LTX-HDR container (default: 127.0.0.1).
-Integer Seed: Seed for reproducibility (default: 10).
-Integer Max Frames: Maximum number of frames to process (default: 161, max: 161).
-Integer Inference Steps (optional): Override for stage 1 inference steps.
-Integer Stage 2 Inference Steps (optional): Override for stage 2 (upscaler) inference steps.
-Text Output Mode: SDR and HDR or SDR Only.
Provides Outputs:
-Video SDR Preview: Tonemapped SDR MP4 preview.
-Video HDR Video: Lossless 16-bit H.265 MP4 in ACEScg colour space.
|
|
GenAI/Fal
|
LTX 2.3 Image to Video
|
Generates high-quality videos from text descriptions and images using the LTX 2.3 model.
Expects Inputs:
-Text Prompt: The text description detailing the subject, style, lighting, and composition of the desired video.
-Image Image: The image to use to drive the video creation.
-Image Last Image (optional): The last image to use to drive the video creation.
-Text Aspect Ratio: The aspect ratio of the video to generate, any of auto, 16:9, 9:16.
-Text Resolution: The resolution of video to generate, any of 1080p, 1440p, and 2160p.
-Text Duration: The duration of the video to generate, any of 6, 8, or 10 seconds.
-Text FPS: The frames per second of the video to generate, any of 24, 25, 48, or 50.
-Text Audio: Audio or No Audio.
Provides Outputs:
-Video Output: The final AI-generated video file.
|
|
GenAI/Fal
|
LTX 2.3 Reference Video to Video
|
Generates a video from a reference video and text prompt using the LTX 2.3 22B model. Can optionally use audio and start/end images.
Expects Inputs:
-Text Prompt: The text description detailing the desired video.
-Video Reference Video: The source video to reference.
-Audio Audio (Optional): Optional audio to use for the video.
-Image Start Image (Optional): Image to use as the first frame.
-Image End Image (Optional): Image to use as the last frame.
-Text Negative Prompt: Text describing behaviors to suppress.
-Integer Seed: Seed for reproducibility.
-Integer Inference Steps: Number of inference steps.
-JSON Tuning Dictionary (optional): Tunable parameters in a JSON dictionary, accepts keys with float values: video_cfg_scale, video_stg_scale, video_rescaling_scale, video_modality_scale, audio_cfg_scale, audio_stg_scale, audio_rescaling_scale, audio_modality_scale, gradient_estimation_gamma, camera_lora_scale, distill_lora_first_pass_scale, distill_lora_second_pass_scale, video_strength, audio_strength.
-Integer Frame Count: If not matching video length, the number of frames to generate.
-Text Match Video Length: Whether to match output to the input video length or use frame count.
-Text Aspect Ratio: Resulting aspect ratio of the generated video.
-Text Generate Audio: Whether to generate audio.
-Text Use Multiscale: Generate coherently starting from a smaller version.
-Text Camera LoRA: Camera movement to apply to the generated video.
-Text Preprocessor: Preprocessing to apply to the reference video.
-Text Video Quality: Quality of the generated video.
Provides Outputs:
-Video Generated Video: The final AI-generated video file.
|
|
GenAI/Fal
|
LTX 2.3 Text to Video
|
Generates high-quality videos from text descriptions using the LTX 2.3 model.
Expects Inputs:
-Text Prompt: The text description detailing the subject, style, lighting, and composition of the desired video.
-Text Aspect Ratio: The aspect ratio of the video to generate, any of 16:9 or 9:16.
-Text Resolution: The resolution of video to generate, any of 1080p, 1440p, or 2160p.
-Text Duration: The duration of the video to generate, any of 6, 8, or 10 seconds.
-Text FPS: The frames per second of the video to generate, any of 24, 25, 48, or 50.
-Text Audio: Audio or No Audio.
Provides Outputs:
-Video Output: The final AI-generated video file.
|
|
GenAI/Fal
|
LTX 2.3 Video Extension
|
Extends an existing video, preserving the action in the original video.
Expects Inputs:
-Video Video: The input video to be extended.
-Text Prompt (optional): The prompt to guide the extension of the video.
-Float Duration (secs): The duration of the video extension.
-Float Seconds to reference: The number of seconds of the input video to use as guidance for the extension.
-Text Mode: Any of Start or End, whether to extend the start or the end of the input video.
Provides Outputs:
-Video Output: The final AI-generated video file.
|
|
GenAI/Fal
|
LTX 2.3 Retake Video
|
Generates a new video from an existing video with the content replaced as defined in the prompt.
Expects Inputs:
-Video Video: The input video to be modified (currently limited to about 15 MB).
-Text Prompt: The prompt guiding the video modification, describing what should be changed in the video and how it should be updated.
-Float Start at (secs): The time at which to start applying the updates.
-Float Duration (secs): The amount of time to apply the updates.
-Text Retake Mode: The mode to use for the retake, any of Replace Audio, Replace Video, or Replace Audio and Video.
Provides Outputs:
-Video Output: The final AI-generated video file.
|
|
GenAI/Fal
|
WAN 2.2 Video Style Transfer
|
Transfers a character or environment style from an image onto an existing video.
Expects Inputs:
-Video Video: The source video to which the style will be applied.
-Image Image: The source image from which the style is drawn.
-Float CFG: The classifier-free guidance scale, which controls how closely the output adheres to the style of the image.
-Integer Inference Steps: The number of inference steps in the process – higher is more accurate but it takes longer.
-Float Shift: Influences the effect the image will have on the video, between 1.0 and 10.0.
-Integer Seed: Influences the determinism of the generation.
-Text Mode: Either Character Transfer or Style Transfer, which to transfer from the image.
-Text Resolution: Any of 480p, 580p, or 720p.
-Text Video Quality: Any of Low, Medium, High, or Maximum.
-Text Turbo: Any of Standard or Turbo, influences the speed of inference with a tradeoff in quality.
Provides Outputs:
-Video Generated Video: The resultant generated video with the style transferred from the image.
|
|
GenAI/Fal
|
WAN 2.5 Image to Video
|
Generates high-quality videos from text descriptions and images using the Wan 2.5 model.
Expects Inputs:
-Text Prompt: The text description detailing the subject, style, lighting, and composition of the desired video.
-Image Image: The image to use to drive the video creation.
-Text Negative Prompt: The text description defining behaviors to be suppressed in video generation.
-Integer Seed: A seed value to guide the determinism of the generation.
-Text Resolution: The resolution of video to generate, any of 480p, 720p, or 1080p.
-Text Duration: The duration of the video to generate, any of 5 or 10 seconds.
Provides Outputs:
-Video Output: The final AI-generated video file.
|
|
GenAI/Fal
|
WAN 2.5 Text to Video
|
Generates high-quality videos from text descriptions using the Wan 2.5 model.
Expects Inputs:
-Text Prompt: The text description detailing the subject, style, lighting, and composition of the desired video.
-Text Negative Prompt: The text description defining behaviors you wish to suppress.
-Integer Seed: A seed value to guide the determinism of the generation.
-Text Aspect Ratio: The aspect ratio of the video to generate, any of 1:1, 16:9, or 9:16.
-Text Resolution: The resolution of video to generate, any of 480p, 720p, or 1080p.
-Text Duration: The duration of the video to generate, any of 5 or 10 seconds.
Provides Outputs:
-Video Output: The final AI-generated video file.
|
|
GenAI/Fal
|
WAN 2.6 Reference to Video
|
Generate a video using reference videos for character/subject consistency (R2V). Models characters referenced as @Video1, @Video2, @Video3 in the prompt.
Expects Inputs:
-Text Prompt: The text prompt describing the desired video.
-Video Video 1: The first reference video.
-Video Video 2: The second reference video.
-Video Video 3: The third reference video.
-Text Negative Prompt: The text description defining behaviors to be suppressed in video generation.
-Integer Seed: The seed for the random number generator.
-Text Aspect Ratio: The aspect ratio of the generated video.
-Text Resolution: The resolution of the generated video.
-Text Duration: The duration of the generated video.
-Text Multi Shot: Whether the generated video should be a multi-shot video.
Provides Outputs:
-Video Generated Video: The final AI-generated video file.
|
|
GenAI/Fal
|
WAN 2.7 Edit Video
|
Edits a video using the Wan 2.7 Video Edit model. Supports instruction-based editing, reference image-based editing, and video style transfer.
Expects Inputs:
-Video Input Video: The input video to be edited.
-Text Prompt: Editing instruction or style transfer description.
-Image Reference Image: An optional reference image for reference-based editing.
-Integer Seed: The seed for the random number generator.
-Text Resolution: Output video resolution tier (720p or 1080p).
-Text Aspect Ratio: Aspect ratio of the generated video.
-Text Audio Setting: Audio handling (Auto Audio or Original Audio).
Provides Outputs:
-Video Generated Video: The final AI-generated video file.
|
|
GenAI/Fal
|
WAN Motion
|
Transfers motion from a driving video onto a reference character image using the Wan Motion model.
Expects Inputs:
-Video Driving Video: The driving video that provides the motion.
-Image Reference Image: The reference image that provides the character’s appearance.
-Text Prompt: An optional text prompt describing the desired video.
-Integer Seed: The seed for the random number generator.
-Text Acceleration: The acceleration level to use.
-Text Adapt Motion: Whether to adapt the driving video's motion to match the reference image's body proportions.
Provides Outputs:
-Video Generated Video: The final AI-generated video file.
|
|
Context
|
Shot Context
|
Captures shot-level context for graph execution and governance.
Expects Inputs:
-Text Shot ID: The shot identifier to use for this graph run.
Provides Outputs:
-Text Shot ID: The normalised shot identifier (trimmed).
-Text Shot Payload: A JSON string payload containing the shot ID (for example: {"shot_id": "sh010"}).
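As a rough illustration of the two outputs above, the following Python sketch assumes that normalisation means stripping surrounding whitespace and that the payload carries a single shot_id key. The function name build_shot_context is hypothetical and is not part of the node itself.

```python
import json


def build_shot_context(shot_id: str) -> tuple[str, str]:
    """Return (normalised shot ID, JSON string payload).

    Assumption: normalisation is a plain whitespace trim, and the
    payload holds only the shot ID under the key "shot_id".
    """
    normalised = shot_id.strip()
    payload = json.dumps({"shot_id": normalised})
    return normalised, payload


if __name__ == "__main__":
    shot_id, payload = build_shot_context("  sh010 ")
    print(shot_id)   # sh010
    print(payload)   # {"shot_id": "sh010"}
```

A downstream consumer can recover the ID with json.loads(payload)["shot_id"].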
|