During my time at Flickr, I was mainly in charge of AutoUploadr. AutoUploadr was implemented by Flickr first, then by Yahoo Mail, Messenger, and Tumblr. It has processed millions, probably billions, of photos and videos. It is some of my favorite tech I've worked on. The goal of AutoUploadr is to let users upload their entire photo library (or only the photos they choose) to Flickr (or any other part of the Yahoo ecosystem). This might sound simple at first, but it is quite a technical challenge.
There is too much to cover in detail here, so I’ll walk through the high-level process of uploading an asset and some of the technical challenges.
Stage the asset. We need to create our own internal copy of the asset before we can start manipulating it. This sidesteps the question: what happens if you start uploading an asset and the user deletes the master copy?
Check whether the asset is a duplicate. To do this, we compute a SHA-256 checksum of the asset and ask the backend whether that checksum already exists.
For every asset, we need to check its size. Since photos and videos can be large, we want to be able to chunk them into smaller pieces (more on this later).
We need to chunk the asset by an appropriate amount, which in our case we found to be 3 MB.
If an asset is really large, we may need to transcode it on the client to make it realistically uploadable. (Writing the transcoding pipeline in a generic way proved to be a monumental pain, but I learned a lot in the process. Looking at you, H.264.)
Upload the chunks until complete.
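The staging, checksum, and chunking steps above can be sketched roughly as follows. This is a minimal, language-agnostic illustration written in Python for brevity, not the actual AutoUploadr code (which ran on iOS). The function names `stage_asset`, `sha256_of`, and `chunk_ranges` are hypothetical, and treating 3 MB as 3 * 1024 * 1024 bytes is an assumption.

```python
import hashlib
import shutil
from pathlib import Path

# Assumption: "3 MB" is interpreted as 3 * 1024 * 1024 bytes.
CHUNK_SIZE = 3 * 1024 * 1024


def stage_asset(source: Path, staging_dir: Path) -> Path:
    """Copy the asset into a private staging directory so later steps
    no longer depend on the user's original file (hypothetical helper)."""
    staging_dir.mkdir(parents=True, exist_ok=True)
    staged = staging_dir / source.name
    shutil.copy2(source, staged)
    return staged


def sha256_of(path: Path, read_size: int = 1024 * 1024) -> str:
    """Hash the file incrementally so a large video is never fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(read_size), b""):
            digest.update(block)
    return digest.hexdigest()


def chunk_ranges(total_size: int, chunk_size: int = CHUNK_SIZE) -> list[tuple[int, int]]:
    """Return (offset, length) pairs that cover the whole file in order."""
    return [
        (offset, min(chunk_size, total_size - offset))
        for offset in range(0, total_size, chunk_size)
    ]
```

In this sketch the checksum is what would be sent to the backend for the duplicate check, and each `(offset, length)` pair describes one chunk to read from the staged copy and upload. A 7 MB file, for example, would yield two 3 MB chunks followed by one 1 MB chunk.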
Uploading assets in the background proved to be a challenge. iOS is restrictive about letting an app perform tasks while it is in the background: the system decides for itself when it can allocate resources to your app, and for an unspecified amount of time. To get around this, we would send a silent push notification to the app every 15 minutes to wake it up in the background so it could continue uploading assets. However, iOS is smart about these kinds of things and seemed to take notice of this abusive behavior. We can’t prove that this actually happened, but we did see some of our clients get less time to perform tasks in the background. This is another case where chunking assets proved useful. Trying to upload a 15 MB asset in one piece in the background would stall the entire pipeline: it would always get stuck on that asset and never successfully upload it, because the system would end our app’s background time before the upload could finish. With 3 MB chunks, we found we could always make incremental progress.
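To illustrate why small chunks allow incremental progress, here is a minimal sketch of a resumable upload loop that works within a limited time budget. It is written in Python for brevity and is not the real iOS implementation. The function `upload_pending_chunks`, the `send_chunk` callback, and the explicit `budget_seconds` parameter are all hypothetical; on iOS the system does not hand you a fixed budget up front, so that parameter only stands in for "however long the system lets the app run".

```python
import time
from typing import Callable


def upload_pending_chunks(
    pending: list[int],
    send_chunk: Callable[[int], bool],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> list[int]:
    """Upload chunks in order until none are left or the time budget runs out.

    Returns the chunk indices that are still pending, so the caller can
    persist them and pick up from there on the next background wake-up.
    """
    deadline = clock() + budget_seconds
    remaining = list(pending)
    while remaining and clock() < deadline:
        index = remaining[0]
        if not send_chunk(index):
            break  # upload failed: keep the chunk pending and retry next time
        remaining.pop(0)  # drop the chunk only after it was sent successfully
    return remaining
```

Because each chunk is recorded as done only after it succeeds, a wake-up that is cut short loses at most one chunk of work rather than the whole asset, which matches the behavior described above for 3 MB chunks versus a single 15 MB upload.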