realutils.tagging.idolsankaku
- Overview:
This module provides utilities for image tagging using IdolSankaku taggers. It includes functions for loading models, processing images, and extracting tags.
The module is inspired by the SmilingWolf/wd-tagger project on Hugging Face.
Overview of IdolSankaku (NSFW warning)
(Figure: an overall benchmark of all the idolsankaku models.)
convert_idolsankaku_emb_to_prediction
- realutils.tagging.idolsankaku.convert_idolsankaku_emb_to_prediction(emb: ndarray, model_name: str = 'SwinV2', general_threshold: float = 0.35, general_mcut_enabled: bool = False, character_threshold: float = 0.85, character_mcut_enabled: bool = False, no_underline: bool = False, drop_overlap: bool = False, fmt: Any = ('rating', 'general', 'character'))[source]
Convert an idolsankaku embedding to an understandable prediction result. This function can process both single embeddings (1-dimensional arrays) and batches of embeddings (2-dimensional arrays).
- Parameters:
emb (numpy.ndarray) – The extracted embedding(s). Can be either a 1-dim array for a single image or a 2-dim array for batch processing
model_name (str) – Name of the idolsankaku model to use for prediction
general_threshold (float) – Confidence threshold for general tags (0.0 to 1.0)
general_mcut_enabled (bool) – Enable MCut thresholding for general tags to improve prediction quality
character_threshold (float) – Confidence threshold for character tags (0.0 to 1.0)
character_mcut_enabled (bool) – Enable MCut thresholding for character tags to improve prediction quality
no_underline (bool) – Replace underscores with spaces in tag names for better readability
drop_overlap (bool) – Remove overlapping tags to reduce redundancy
fmt (Any) – Specify the return format structure for predictions; default is ('rating', 'general', 'character').
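The MCut options above refer to Maximum Cut thresholding: instead of a fixed cutoff, the scores are sorted in descending order and the threshold is placed in the middle of the largest gap between adjacent scores. A minimal sketch of the idea (the function name `mcut_threshold` here is illustrative, not the library's internal API):

```python
import numpy as np

def mcut_threshold(probs: np.ndarray) -> float:
    """Pick a threshold at the largest gap between sorted scores."""
    sorted_probs = np.sort(probs)[::-1]          # descending order
    gaps = sorted_probs[:-1] - sorted_probs[1:]  # gaps between neighbors
    t = int(gaps.argmax())                       # index of the largest gap
    # threshold sits halfway inside that gap
    return float((sorted_probs[t] + sorted_probs[t + 1]) / 2)

# scores with a clear gap between the confident and unconfident tags
scores = np.array([0.9, 0.85, 0.1, 0.05])
print(mcut_threshold(scores))  # 0.475
```

This adapts the cutoff per image, which is why enabling it can improve prediction quality when confidence distributions vary a lot between images.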
- Returns:
For single embeddings (1-dim input): a prediction result structured according to fmt. For batch processing (2-dim input): a list where each element holds one embedding's predictions in the same format as the single-embedding output.
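The single-vs-batch dispatch described above can be sketched as follows. The classification head (`weights`, `bias`) and the sigmoid scoring are placeholders standing in for the model's real head, not the library's actual internals:

```python
import numpy as np

def _sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))

def convert_emb(emb: np.ndarray, weights: np.ndarray, bias: np.ndarray,
                threshold: float = 0.35):
    """Dispatch on ndim: single embedding -> dict, batch -> list of dicts."""
    if emb.ndim == 1:
        # single embedding: score every tag, keep those above the threshold
        scores = _sigmoid(emb @ weights + bias)
        return {i: float(s) for i, s in enumerate(scores) if s >= threshold}
    # batch: apply the single-embedding path to each row
    return [convert_emb(e, weights, bias, threshold) for e in emb]
```

Each batch element therefore gets exactly the same treatment as a lone embedding, which is why the batch output is simply a list of single-embedding results.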
- Example:
>>> import numpy as np
>>> from realutils.tagging import get_idolsankaku_tags, convert_idolsankaku_emb_to_prediction
>>>
>>> # extract the feature embedding, shape: (W, )
>>> embedding = get_idolsankaku_tags('skadi.jpg', fmt='embedding')
>>>
>>> # convert to understandable result
>>> rating, general, character = convert_idolsankaku_emb_to_prediction(embedding)
>>> # these 3 dicts will be the same as those returned by `get_idolsankaku_tags('skadi.jpg')`
>>>
>>> # batch processing, shape: (B, W)
>>> embeddings = np.stack([
...     get_idolsankaku_tags('img1.jpg', fmt='embedding'),
...     get_idolsankaku_tags('img2.jpg', fmt='embedding'),
... ])
>>> # results will be a list of (rating, general, character) tuples
>>> results = convert_idolsankaku_emb_to_prediction(embeddings)
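The `drop_overlap` option removes tags that are implied by more specific ones. The idea can be sketched with a hypothetical implication map (`parents` below is illustrative; the real overlap table ships with the library):

```python
def drop_overlap_tags(tags: dict, parents: dict) -> dict:
    """Drop any tag that a more specific tag in the result already implies.

    parents: hypothetical map from a tag to the set of broader tags it implies.
    """
    implied = set()
    for tag in tags:
        implied |= parents.get(tag, set())
    return {tag: score for tag, score in tags.items() if tag not in implied}

tags = {'long_hair': 0.9, 'hair': 0.8, 'smile': 0.7}
parents = {'long_hair': {'hair'}}
print(drop_overlap_tags(tags, parents))  # {'long_hair': 0.9, 'smile': 0.7}
```

Here 'hair' is dropped because 'long_hair' already conveys it, which is the redundancy reduction the parameter description refers to.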