If you can run things locally, the two main ways to do this are inpainting (drawing a mask over the area the current prompt should apply to and only filling in that) and regional prompting (aka "forge couple" in reForge), which splits the image into sections, so you can say "this part of the prompt applies to the whole image, this part to the upper half, and this part to the lower-left quadrant".
Basically, it's possible to work the system, but it's not straightforward.
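To make the regional prompting idea a bit more concrete, here's a rough sketch of the "which prompt applies where" logic. This is not the actual forge couple code or its API; the `Region` class, the `prompts_at` helper, and the example prompt strings are all made up for illustration. Coordinates are fractions of the image size, with y growing downward.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A rectangular area of the image (hypothetical helper, not a real extension API)."""
    prompt: str
    x0: float  # left edge, as a fraction of image width
    y0: float  # top edge, as a fraction of image height
    x1: float  # right edge (exclusive)
    y1: float  # bottom edge (exclusive)

    def contains(self, x: float, y: float) -> bool:
        return self.x0 <= x < self.x1 and self.y0 <= y < self.y1


# The three-way split described above: whole image, upper half, lower-left quadrant.
REGIONS = [
    Region("global style tags", 0.0, 0.0, 1.0, 1.0),
    Region("upper-half subject", 0.0, 0.0, 1.0, 0.5),
    Region("lower-left subject", 0.0, 0.5, 0.5, 1.0),
]


def prompts_at(x: float, y: float, regions=REGIONS) -> list:
    """Return every prompt whose region covers the point (x, y)."""
    return [r.prompt for r in regions if r.contains(x, y)]


if __name__ == "__main__":
    print(prompts_at(0.25, 0.25))  # a point in the upper half
    print(prompts_at(0.25, 0.75))  # a point in the lower-left quadrant
    print(prompts_at(0.75, 0.75))  # lower right: only the global prompt applies
```

In a real setup the regions get turned into masks that weight each prompt's conditioning during sampling, which is where most of the fiddliness comes from.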
[@Background Pony \#09DE](/forums/dis/topics/general-questions?post_id=838#post_838)
Since I'm already here... Raw number of images isn't really a concern, since storage is fairly cheap. Transfer is the main cost factor there; the site would have to scale close to derpi levels of _user numbers and activity_ for there to be a "problem". Which IMO would be a happy problem.