Three Things I Learned About Export Job Performance From My Uncle Paulo

1. “Realize you can try to do it all…but you’ll do it slowly, very slowly.”

It’s understandable to want to export every single dimension, for every CRF in your study, for all time. However, if that export takes 15 hours, a job that starts during off-hours may still be running the next business day and impact users, or worse, impact users in another time zone who are just starting their workday.

It’s better to first get a sense of how long a small subset of the data takes to export. Manually run an export with a reduced scope (e.g. limiting the temporal scope to a month), note the elapsed time, and multiply it by the scope quotient (i.e. estimated total scope divided by reduced scope) to get a rough estimate of how long your full export may take.
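As a rough sketch of that arithmetic in Python: the function name and the choice of days as the unit of scope are my own, and it assumes export time grows roughly linearly with temporal scope, which may not hold for every dataset.

```python
def estimate_full_export_hours(sample_hours, total_scope_days, sample_scope_days):
    """Scale a timed sample export up to the full temporal scope.

    Assumes export time is roughly proportional to the temporal scope.
    """
    if sample_scope_days <= 0:
        raise ValueError("sample_scope_days must be positive")
    # Scope quotient: estimated total scope divided by reduced scope.
    scope_quotient = total_scope_days / sample_scope_days
    return sample_hours * scope_quotient


# Hypothetical numbers: a 30-day sample took half an hour,
# and the full study covers about three years (1095 days).
estimated_hours = estimate_full_export_hours(0.5, 1095, 30)
print(f"Estimated full export: {estimated_hours:.2f} hours")
```

With these made-up numbers the scope quotient is 36.5, so the full export would be expected to take a little over 18 hours. Treat the result as a ballpark figure, not a guarantee.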

2. “Those temporal scope fields are your friend.”

If you don’t choose a temporal scope for your dataset, you will get data for a wider timeframe than you’re likely to be alive. This may be fine for a dataset with only a few dimensions, or if you are looking for specific data that you aren’t sure you would capture with too narrow a time scope.

However, if you KNOW what the scope is, you should specify it. This matters even more with very large datasets. Breaking a dataset down into smaller scoped datasets gives you manageable chunks of data to schedule for export.
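One way to plan those smaller scopes is to split the overall date range into calendar months. The sketch below is only an illustration: the function name is hypothetical, the end date of each pair is treated as exclusive, and how you enter the resulting dates into your export tool's temporal scope fields depends on that tool.

```python
from datetime import date


def monthly_scopes(start, end):
    """Split [start, end) into consecutive (scope_start, scope_end) pairs,
    one per calendar month. Each scope_end is exclusive."""
    scopes = []
    cursor = start
    while cursor < end:
        # First day of the month following the cursor.
        if cursor.month == 12:
            next_month = date(cursor.year + 1, 1, 1)
        else:
            next_month = date(cursor.year, cursor.month + 1, 1)
        scope_end = min(next_month, end)
        scopes.append((cursor, scope_end))
        cursor = scope_end
    return scopes


# Hypothetical study window.
for scope_start, scope_end in monthly_scopes(date(2023, 11, 15), date(2024, 2, 1)):
    print(scope_start, "to", scope_end)
```

Each pair can then become its own scoped dataset, scheduled as a separate export job.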

3. “Let your jobs breathe.”

You wouldn’t schedule two appointments, one after the other, if you didn’t know what time the first appointment would end, would you? The same idea applies to scheduled export jobs. While there’s nothing inherently wrong with scheduling one job right after another, you can only get away with this when you have a clear sense of when the first job will end. If you have no idea when that is, the first job may still complete successfully, but the second will not if the first overruns its start time. When in doubt, space your jobs out and give each one enough time to complete.
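A minimal sketch of that spacing, assuming you already have a rough duration estimate for each job: every job starts only after the previous job's estimate, padded by a safety factor, has elapsed. The function name and the 1.5 factor are arbitrary choices of mine, not a recommendation from any particular scheduler.

```python
from datetime import datetime, timedelta


def space_out_jobs(first_start, estimated_durations, safety_factor=1.5):
    """Return a start time for each job, leaving breathing room between them.

    Each job is scheduled after the previous job's estimated duration
    multiplied by safety_factor.
    """
    starts = []
    cursor = first_start
    for duration in estimated_durations:
        starts.append(cursor)
        # Pad the estimate so an overrun is less likely to hit the next job.
        cursor = cursor + duration * safety_factor
    return starts


# Hypothetical estimates: a 2-hour export followed by a 1-hour export.
schedule = space_out_jobs(
    datetime(2024, 1, 1, 20, 0),
    [timedelta(hours=2), timedelta(hours=1)],
)
for start in schedule:
    print(start)
```

Here the second job would start at 23:00 rather than 22:00, leaving an hour of slack in case the first export runs long.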

Remember, it’s better to measure twice, and cut once. (My uncle Paulo didn’t actually say this last one.)

– Tope Oluwole
