Dan Andreescu
2013-11-21 15:39:38 UTC
Dear Wikimetrics users,
I've just deployed asynchronous cohort upload. This is feature #818:
https://mingle.corp.wikimedia.org/projects/analytics/cards/818
It lets you upload larger cohorts because validation now runs in the
background. I'll go over how the new functionality works here, and will
rely on one of you to point me to the appropriate on-wiki place to update
documentation.
Visiting /cohorts and clicking "Upload Cohort" works as before. But once
you click "Upload CSV", your form is validated and processed, and you're
taken back to the cohorts page. Your new cohort is created immediately but
is not yet validated. While it validates, you'll see the validation status
and have a few options:
* Remove Cohort. This is destructive and will remove this cohort from your
list. Use this in case you made a mistake, uploaded the wrong file, etc.
* Validate Again. This will run validation again. One possible use: say
you upload a cohort with some *very* newly registered users and, because of
replication lag to the labsdb databases, most of them come up invalid. Once
replication has caught up, you can run validation again.
* Refresh. This just refreshes the status of the validation and will
update the counts that show up below.
You will not have the "Create Report" option until validation is done.
When you do create a report, only valid users will be used in the output.
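
As a rough illustration of the flow described above, here is a minimal
sketch in Python. It is not the Wikimetrics code: the Cohort class, the
Status values, and the known_users argument are all hypothetical names made
up for this example, and validation is reduced to a simple membership check.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"      # cohort exists, validation not finished
    VALIDATED = "validated"  # validation finished


@dataclass
class Cohort:
    """Hypothetical stand-in for an uploaded cohort."""
    name: str
    usernames: list
    status: Status = Status.PENDING
    valid: set = field(default_factory=set)
    invalid: set = field(default_factory=set)

    def validate(self, known_users):
        """Run (or re-run) validation against the users visible right now."""
        self.status = Status.PENDING
        self.valid = {u for u in self.usernames if u in known_users}
        self.invalid = set(self.usernames) - self.valid
        self.status = Status.VALIDATED

    def counts(self):
        """What a "Refresh" would show: the current validation counts."""
        return {
            "valid": len(self.valid),
            "invalid": len(self.invalid),
            "total": len(self.usernames),
        }

    def can_create_report(self):
        """"Create Report" is only offered once validation is done."""
        return self.status is Status.VALIDATED

    def report_users(self):
        """Only valid users are used in report output."""
        if not self.can_create_report():
            raise RuntimeError("validation has not finished")
        return sorted(self.valid)
```

In this sketch, calling validate() a second time with a larger known_users
set plays the role of "Validate Again" after replication lag has cleared:
users that were invalid on the first pass can become valid on the second.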
One caveat: validation is still slow, and the time limit for the
asynchronous task is set to 1 hour. I have some ideas for making
validation faster by batching, and I can increase the time limit per task
(but that has other repercussions). For now, just keep in mind that the
theoretical maximum cohort size you should upload is roughly 18,000 users.
I would love some feedback about whether it's ok to increase the time
limit or if people want me to focus on making validation faster.
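
To make the trade-off concrete, here is a back-of-the-envelope sketch. The
only inputs taken from this message are the 1 hour limit and the 18,000
user figure. The assumption that validation time grows linearly with cohort
size is mine for illustration, and the function names are made up.

```python
TIME_LIMIT_SECONDS = 60 * 60  # 1 hour task limit, from this message
MAX_USERS = 18_000            # rough maximum cohort size, from this message


def implied_rate(max_users=MAX_USERS, limit_seconds=TIME_LIMIT_SECONDS):
    """Users validated per second implied by the two figures above."""
    return max_users / limit_seconds


def max_cohort_size(limit_seconds, users_per_second):
    """Largest cohort that fits in the limit, assuming linear scaling."""
    return int(limit_seconds * users_per_second)


rate = implied_rate()
print(f"implied rate: {rate} users/second")
print("with a 2 hour limit:", max_cohort_size(2 * 60 * 60, rate))
print("same limit, twice as fast:", max_cohort_size(TIME_LIMIT_SECONDS, 2 * rate))
```

Under that linear assumption, doubling the time limit and doubling the
validation speed both double the maximum cohort size, so the choice comes
down to which change has fewer side effects.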
Dan