A tuning-free framework for precise region-specific image manipulation based on refocusing cross-attention.
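
The description does not say how the cross-attention is refocused, so what follows is only an illustrative sketch of one plausible reading, not the framework's actual method. It assumes each image position holds a softmax distribution over text tokens, and that "refocusing" means removing the edit tokens' weight at positions outside the target region and renormalizing, with no model weights updated. The function name `refocus_cross_attention` and its parameters are hypothetical.

```python
def refocus_cross_attention(attn, region_mask, edit_tokens):
    """Suppress attention to the edit tokens outside the target region.

    Hypothetical helper for illustration only.
    attn:        list of rows, one per image position; each row is a
                 softmax distribution over text tokens.
    region_mask: list of bools, True where the position is inside the region.
    edit_tokens: indices of the text tokens that describe the edit.
    """
    edit = set(edit_tokens)
    refocused = []
    for row, inside in zip(attn, region_mask):
        if inside:
            # In-region positions keep their original attention.
            refocused.append(list(row))
            continue
        # Outside the region, drop the weight placed on edit tokens.
        kept = [0.0 if j in edit else w for j, w in enumerate(row)]
        total = sum(kept)
        if total > 0.0:
            # Renormalize so the row is a distribution again.
            kept = [w / total for w in kept]
        # If all weight was on edit tokens, the row is left as zeros.
        refocused.append(kept)
    return refocused


# Two image positions, two text tokens; token 1 describes the edit.
attn = [[0.5, 0.5], [0.25, 0.75]]
region_mask = [True, False]
print(refocus_cross_attention(attn, region_mask, edit_tokens=[1]))
```

Because the adjustment acts only on attention weights at inference time, a sketch like this is consistent with the "tuning-free" claim: nothing is trained or fine-tuned.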