Fine tuning Stable Diffusion with Flux for character consistency
Myint Lwin, Ye Yint; Zhang, Xin (2025)
All rights reserved. This publication is copyrighted. You may download, display, and print it for your own personal use. Commercial use is prohibited.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:amk-2025121938693
Abstract
This thesis explores how to achieve consistent character identity in AI-generated images using Stable Diffusion Flux. Current diffusion models often produce visually coherent images but fail to keep facial features and attributes stable across different scenes. To address this, the study introduces a Character Consistency Evaluation Dataset (CCED) and a multi-metric evaluation framework that combines DeepFace, CLIP, and LPIPS to assess identity, semantic alignment, and perceptual fidelity, respectively.
In parallel, a reusable Prompt & Embedding Template Library was developed using Textual Inversion and LoRA fine-tuning. Tests across portrait, contextual, and stylized templates show that LoRA significantly improves identity preservation compared to prompt-only baselines and remains consistent across randomized seeds.
The results were implemented in an interactive Gradio-based evaluation tool. The study provides a practical method for improving and measuring character consistency in generative AI, supporting future research on fine-tuning-oriented diffusion models.
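As a rough illustration of how three such metrics might be folded into a single number, the sketch below uses plain Python with no model dependencies. It is not the thesis's actual framework: the function names, the weights, and the choice to invert LPIPS as 1 - distance are all assumptions made for the example. The per-image inputs are assumed to come from a face-embedding comparison (for instance via DeepFace), a CLIP image-text similarity, and an LPIPS distance to a reference image.

```python
from math import sqrt
from statistics import mean, pstdev


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embedding has zero norm")
    return dot / (norm_a * norm_b)


def consistency_score(identity_sim, clip_sim, lpips_dist, weights=(0.5, 0.3, 0.2)):
    """Weighted mean of three per-image scores.

    The weights are hypothetical and chosen only for illustration.
    identity_sim: face-embedding similarity to the reference (higher is better)
    clip_sim:     image-text similarity (higher is better)
    lpips_dist:   perceptual distance to the reference (lower is better)
    """
    w_id, w_clip, w_lpips = weights
    # Clamp the distance to [0, 1], then invert it so higher means more similar.
    perceptual_sim = 1.0 - min(max(lpips_dist, 0.0), 1.0)
    total = w_id + w_clip + w_lpips
    return (w_id * identity_sim + w_clip * clip_sim + w_lpips * perceptual_sim) / total


def summarize_over_seeds(scores):
    """Mean and population standard deviation of scores from different seeds."""
    return mean(scores), pstdev(scores)
```

A low standard deviation from `summarize_over_seeds` would be one simple way to express "consistent across randomized seeds" for a given template. Any real weighting would need to be calibrated against the dataset.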
