1. Both cases grab the image and run it. Better per-layer caching (including very aggressive caching of common layers) is coming soon, so stay tuned.
2. No current equivalent, though there are thoughts on exposing more scaling controls (e.g. max-instances, min-instances). Max is easy; min is harder because of the cost implications. GAE was billed on "instance hours", but Run bills for CPU time, so if you set "min-instances=1" you're effectively paying for a VM. Exposing these controls probably makes more sense on something like Run on GKE, where you're already paying for the compute.
3. Yes, though since a Run instance can handle multiple requests concurrently, for certain (most?) load profiles you're going to see way fewer cold starts, because an already-running instance can pick up the new requests.
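
To put rough numbers on the concurrency point in (3), here is a back-of-the-envelope sketch. It is not how Run's autoscaler actually decides anything; it just applies Little's law (in-flight requests = arrival rate × latency) and divides by the per-instance concurrency. The function name, the traffic figures, and the concurrency value of 80 are all made up for illustration.

```python
import math


def instances_needed(requests_per_second: int, avg_latency_ms: int, concurrency: int) -> int:
    """Rough estimate of how many instances a steady load keeps busy.

    Hypothetical helper, not part of any Cloud Run API.
    Little's law: average in-flight requests = arrival rate * latency.
    """
    in_flight = requests_per_second * avg_latency_ms / 1000
    # Always at least one instance once there is any traffic to serve.
    return max(1, math.ceil(in_flight / concurrency))


# Illustrative load: 100 req/s at 200 ms each, i.e. about 20 requests in flight.
one_at_a_time = instances_needed(100, 200, concurrency=1)
multi_concurrent = instances_needed(100, 200, concurrency=80)

print(one_at_a_time)     # 20
print(multi_concurrent)  # 1
```

Under these assumed numbers, a one-request-per-instance model needs about 20 instances (and so up to 20 cold starts to get there), while a multi-concurrent instance absorbs the same load on a single warm instance.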