Discussion:
[Wikimetrics] The threshold and time to threshold metrics
Steven Walling
2013-10-25 21:07:43 UTC
Hey all,

I used the threshold metric for the first time yesterday. First off, thanks
for adding it! Dario tells me it was brand new as of yesterday? He also
said it needs vetting?

One piece of feedback: combining threshold and 'time to threshold' seems to
make things more confusing. For example, when you select sum as an output,
you also get the sum of the time to threshold. That result -- like
"time_to_threshold": 92.7864 -- seems to be simply the sum of hours for the
members of the cohort. Knowing that it took the cohort a combined 92 hours
to reach the threshold isn't very actionable.
--
Steven Walling,
Product Manager
https://wikimediafoundation.org/
Diederik van Liere
2013-10-25 21:16:50 UTC
Post by Steven Walling
Hey all,
I used the threshold metric for the first time yesterday. First off,
thanks for adding it! Dario tells me it was brand new as of yesterday? He
also said it needs vetting?
Yes it needs extra vetting!
Post by Steven Walling
One piece of feedback: combining threshold and 'time to threshold' seems
to make things more confusing. For example, when you select sum as an
output, you also get the sum of the time to threshold. That result -- like
"time_to_threshold": 92.7864 -- seems to be simply the sum of hours for the
members of the cohort. Knowing that it took the cohort a combined 92 hours
to reach the threshold isn't very actionable.
So.......what are you proposing? separating it as two separate metrics?
_______________________________________________
Wikimetrics mailing list
Wikimetrics at lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikimetrics
Steven Walling
2013-10-25 21:43:57 UTC
Post by Diederik van Liere
So.......what are you proposing? separating it as two separate metrics?
Separating is probably the simplest thing to do, but you could also just
remove Sum as an output for time to threshold. The two metrics can make
sense together, if you check out a result like:

"Average": {
    "threshold": 0.1735,
    "time_to_threshold": 0.7863
}
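
To make the difference between the two outputs concrete, here is a minimal Python sketch of how Sum and Average could be derived from per-user results. The cohort data, the `aggregate` helper, and the choice to average only over users who reached the threshold are all assumptions for illustration, not Wikimetrics code.

```python
from statistics import mean

# Hypothetical per-user results: hours needed to reach the threshold,
# or None if the user never reached it.
time_to_threshold = {"user_a": 0.5, "user_b": None, "user_c": 1.5, "user_d": None}


def aggregate(values):
    """Return Sum and Average for both submetrics (illustrative only)."""
    reached = [v for v in values if v is not None]
    return {
        "Sum": {
            "threshold": len(reached),          # how many users hit the threshold
            "time_to_threshold": sum(reached),  # combined hours, hard to act on
        },
        "Average": {
            "threshold": len(reached) / len(values),  # share of the cohort
            # Assumption: averaged only over users who reached the threshold.
            "time_to_threshold": mean(reached) if reached else None,
        },
    }


result = aggregate(list(time_to_threshold.values()))
print(result)
```

In this toy cohort the Average row reads as "half the cohort reached the threshold, taking one hour on average", while the summed hours have no comparable interpretation.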
Dan Andreescu
2013-10-25 21:53:20 UTC
Hi, so there's a brief story behind this. Stefan caught this problem
before we deployed and I made the call to push it out without fixing. In
wikimetrics parlance, "threshold" and "time_to_threshold" are submetrics of
the Threshold metric. I think the right solution here is to make a map
from Aggregations to Submetrics. This would describe which aggregate is
allowed for which submetric, and we could display this mapping along with
an explanation on a page under /reports.

The reason we chose to compute the metrics together is that if you think
about it:

threshold = time_to_threshold is not null

So resource-wise, you're basically getting it for free.
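
A rough Python sketch of the two ideas in this message: the map from aggregations to submetrics, and deriving threshold from time_to_threshold. The names `SUBMETRICS_BY_AGGREGATION`, `is_allowed`, and `threshold_from_time` are hypothetical, and which aggregate is allowed for which submetric is an assumption based on the suggestion earlier in the thread to drop Sum for time to threshold.

```python
# Hypothetical map from aggregation name to the submetrics it may be applied to.
# Assumption: Sum is not offered for time_to_threshold.
SUBMETRICS_BY_AGGREGATION = {
    "sum": {"threshold"},
    "average": {"threshold", "time_to_threshold"},
}


def is_allowed(aggregation, submetric):
    """True if the aggregation is permitted for the given submetric."""
    return submetric in SUBMETRICS_BY_AGGREGATION.get(aggregation, set())


def threshold_from_time(time_to_threshold):
    """threshold = time_to_threshold is not null."""
    return time_to_threshold is not None


print(is_allowed("sum", "time_to_threshold"))  # disallowed under the assumed map
print(threshold_from_time(3.2), threshold_from_time(None))
```

The same dictionary could also back the explanatory page, since it already lists every permitted combination.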
Dario Taraborelli
2013-10-25 23:24:04 UTC
Steven's point goes back to a suggestion I made a while ago: we need to avoid a many-to-many relation between metrics and aggregators.

Each metric should return just values of one type (e.g. no mixing of booleans and integers, like threshold and time to threshold), and we should specify for each metric: (1) what the expected type of the output is and (2) which aggregators are appropriate for that type.

Practically, we can group metrics into categories depending on the attribute they compute:
• binary attributes (e.g. "got reverted", "got blocked", "is productive", "hit threshold")
• counts ("bytes added", "pages created", "time to threshold")
• rates ("revert rate")

Each of these attributes will have a canonical type:
• boolean for binary attributes
• integer for counts
• float for rates

We can then specify what aggregator is valid as a function of the metric category/type.
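
The categories and canonical types listed above could be encoded roughly as follows. This is only a sketch: the `Category` enum, the table names, and especially the aggregator assignments are placeholders, since the thread does not settle which aggregators belong to which category.

```python
from enum import Enum


class Category(Enum):
    BINARY = "binary attribute"
    COUNT = "count"
    RATE = "rate"


# Canonical type per category, as proposed in the message.
CANONICAL_TYPE = {Category.BINARY: bool, Category.COUNT: int, Category.RATE: float}

# Placeholder aggregator assignment (an assumption, not decided in the thread,
# e.g. whether sum makes sense for every count).
VALID_AGGREGATORS = {
    Category.BINARY: ("sum", "average"),
    Category.COUNT: ("sum", "average"),
    Category.RATE: ("average",),
}

# Example metrics taken from the lists above.
METRIC_CATEGORY = {
    "got reverted": Category.BINARY,
    "hit threshold": Category.BINARY,
    "bytes added": Category.COUNT,
    "pages created": Category.COUNT,
    "time to threshold": Category.COUNT,
    "revert rate": Category.RATE,
}


def aggregators_for(metric):
    """Look up the valid aggregators via the metric's category."""
    return VALID_AGGREGATORS[METRIC_CATEGORY[metric]]


print(aggregators_for("revert rate"))
print(CANONICAL_TYPE[METRIC_CATEGORY["hit threshold"]])
```

Because aggregators hang off the category rather than the individual metric, adding a new metric only requires assigning it a category.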

How does that sound?

Dario
Dan Andreescu
2013-10-26 03:20:15 UTC
That works for me, I'm just curious what changed your mind (from the
definition of #699 we hammered out together). It really is no big deal
either way though.
Dario Taraborelli
2013-10-26 15:10:11 UTC
I did see the benefits of your suggestion of TTT (time to threshold) as a submetric, but I hadn't thought through the usability implications when it comes to aggregators. As far as I know, this is the only metric with a submetric attached to it among those implemented so far, right?
Dario Taraborelli
2013-10-26 15:22:23 UTC
I was also thinking that, while both approaches could work for the end user as long as there's a UI, handling aggregators for submetrics will be a pain when we turn Wikimetrics into an API that can be queried via HTTP. The "one metric, one response type, one aggregator" approach should make things much more straightforward.
Dan Andreescu
2013-10-27 18:20:00 UTC
Got it, that's perfectly reasonable. And like I said, not a big deal to
change any of this. Ok, so right now the following metrics have
sub-metrics:

bytes-added: net_sum, absolute_sum, positive_only_sum, negative_only_sum
survival: survived, censored
threshold: threshold, time_to_threshold, censored

I thought about this and figured out an alternative that may make sense.
We can keep censored, as it's not so much a sub-metric as an informational
field. And we can keep the bytes_added submetrics together because they'll
always aggregate the same way. But in the case of threshold, where
disparate data types are returned, we can just return two results (so you'd
have two rows on the reports page). This is probably the trickiest way to
do it for me, but it seems the cleanest for the user. Thoughts?
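
One way to picture the "two results" idea is the sketch below, which splits a combined threshold result into one row per data type and carries censored along with each. The input shape and the `split_threshold_result` helper are hypothetical and not taken from the Wikimetrics code base.

```python
def split_threshold_result(per_user):
    """Split a mixed-type threshold result into two report rows.

    per_user maps a user id to a dict with the keys "threshold" (bool),
    "time_to_threshold" (hours or None) and "censored" (bool).
    """
    rows = {"threshold": {}, "time_to_threshold": {}}
    for user, values in per_user.items():
        for name in rows:
            # censored is kept on both rows because it is informational.
            rows[name][user] = {name: values[name], "censored": values["censored"]}
    return [{"metric": name, "results": results} for name, results in rows.items()]


# Made-up example data.
combined = {
    "user_a": {"threshold": True, "time_to_threshold": 0.5, "censored": False},
    "user_b": {"threshold": False, "time_to_threshold": None, "censored": True},
}

report_rows = split_threshold_result(combined)
for row in report_rows:
    print(row["metric"], row["results"])
```

Each row then holds values of a single type, so an aggregator can be chosen per row without special-casing submetrics.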

Dan

