An advanced general AI would know that we want to remove the cars, and it would know how to automatically fill the gaps with convincing backgrounds: correct perspective, perfectly lit, perfectly sized relative to the surrounding image areas, with perfectly blending textures. Luckily we are far from that, at least for the next few years 😁
But really, if you think about it, these are really hard problems, and not only computationally. Currently, we train generative AIs with images. That way they “learn” features of real images. Given enough training images, a model can match the surroundings of the gaps you want to fill against the content of similar images, for various criteria of similarity.
Oversimplifying, one could say that Photoshop’s “Content-aware Fill” draws from areas of your own image, while a generative AI draws from the vast number of images it has “learned” from.
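To make the “draws from areas of your own image” idea a bit more concrete, here is a toy sketch in Python. It is not Photoshop’s actual algorithm, just a naive patch search I made up for illustration: the function name `fill_from_own_image`, the `margin` parameter and the grayscale list-of-lists image format are all assumptions. It looks at the pixels around the gap, finds the place elsewhere in the same image whose surroundings match best, and copies that content into the gap.

```python
def fill_from_own_image(image, mask, margin=2):
    """Toy fill: copy the best-matching patch from elsewhere in the same image.

    image: list of rows of grayscale values; mask: same shape, True marks the gap.
    """
    height, width = len(image), len(image[0])
    hole = [(y, x) for y in range(height) for x in range(width) if mask[y][x]]
    if not hole:
        return [row[:] for row in image]

    # Window = bounding box of the gap plus a margin of known pixels around it.
    top = max(min(y for y, _ in hole) - margin, 0)
    left = max(min(x for _, x in hole) - margin, 0)
    bottom = min(max(y for y, _ in hole) + margin + 1, height)
    right = min(max(x for _, x in hole) + margin + 1, width)
    h, w = bottom - top, right - left

    offsets = [(dy, dx) for dy in range(h) for dx in range(w)]
    # Only the known pixels of the window can be compared with candidates.
    known = [(dy, dx) for dy, dx in offsets if not mask[top + dy][left + dx]]

    best_cost, best_pos = None, None
    for y in range(height - h + 1):
        for x in range(width - w + 1):
            # Skip candidate patches that touch the gap themselves.
            if any(mask[y + dy][x + dx] for dy, dx in offsets):
                continue
            # Similarity criterion: sum of squared differences on known pixels.
            cost = sum(
                (image[y + dy][x + dx] - image[top + dy][left + dx]) ** 2
                for dy, dx in known
            )
            if best_cost is None or cost < best_cost:
                best_cost, best_pos = cost, (y, x)

    if best_pos is None:
        raise ValueError("no gap-free source patch of the required size")

    # Copy only the gap pixels from the best-matching patch.
    result = [row[:] for row in image]
    src_y, src_x = best_pos
    for y, x in hole:
        result[y][x] = image[src_y + (y - top)][src_x + (x - left)]
    return result
```

On a repetitive texture this works surprisingly well, because a matching patch really exists somewhere else in the picture. On a scenic photo with a car in it, it usually doesn’t, and that is exactly where a model trained on many other images has the advantage.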
Still, that’s not enough to make judgements about what a user will want to remove. You’d need a prompt like “remove all cars” for that.
An always-online AI could, of course, collect the prompts of its users and thereby “learn” that users tend to remove cars from scenic images. Sounds like an Orwellian nightmare? Yes, but that’s what is actually done. What did you expect? 😄