This has been happening for a long time, but the move to a CDN could help fix it.
1. Load each word's image size together with the DOM, defining width/height or aspect-ratio when the page loads, so the space is reserved before the image arrives.
2. Serve low-res, compressed images as thumbnails. Mobile should use these unless the full image is clicked. Alternatively, load the thumbnail first, then load the high-res version absolutely positioned over the low-res one.
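
A rough sketch of how the two suggestions above could fit together, written as a plain function that builds the markup. Everything here is an assumption for illustration: the `CardImage` shape, the `renderCardImage` name, the `card-image` class, and the idea that the stored width and height are available when the card is rendered. It is not how the site currently renders cards.

```typescript
// Hypothetical record: image dimensions stored alongside each word's image.
interface CardImage {
  thumbUrl: string; // low-res, compressed thumbnail
  fullUrl: string;  // high-res version
  width: number;    // intrinsic width in pixels
  height: number;   // intrinsic height in pixels
  alt: string;
}

// Escape a value so it is safe inside a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build the card image markup.
// The thumbnail stays in normal flow and carries width/height, so the browser
// can reserve the right amount of space before any image data arrives.
// When includeFull is true, the high-res image is layered on top of it with
// absolute positioning, so it never changes the layout when it finishes loading.
function renderCardImage(img: CardImage, includeFull: boolean): string {
  const size = `width="${img.width}" height="${img.height}"`;
  const lines = [
    `<div class="card-image" style="position:relative">`,
    `  <img src="${escapeAttr(img.thumbUrl)}" ${size} alt="${escapeAttr(img.alt)}"` +
      ` style="display:block;width:100%;height:auto">`,
  ];
  if (includeFull) {
    lines.push(
      `  <img src="${escapeAttr(img.fullUrl)}" ${size} alt=""` +
        ` style="position:absolute;inset:0;width:100%;height:100%">`
    );
  }
  lines.push(`</div>`);
  return lines.join("\n");
}

// Example: desktop gets both layers, mobile gets the thumbnail only
// until the user clicks through to the full image.
const example: CardImage = {
  thumbUrl: "/thumbs/cat.jpg",
  fullUrl: "/full/cat.jpg",
  width: 400,
  height: 300,
  alt: "cat",
};
const desktopHtml = renderCardImage(example, true);
const mobileHtml = renderCardImage(example, false);
```

Because the thumbnail alone determines the box size, the buttons below the image should keep their position whether or not the high-res layer ever loads.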
This 'bug' can also cause the listen / example buttons not to show when learning new words if the image takes too long to load. You then have to go to the next card and back for the layout to resize.