Extend metric expiration date #9710

Merged: 1 commit, Jul 8, 2024

mcleinman
Collaborator

On draft until data review is complete


Description

In #9057, we added several new metrics to measure connection health. From that PR: "We're adding 9 new metrics, and removing a bunch as well. We think that some subset of these 9 will be most useful, but aren't quite sure. Thus, these metrics all will live for about 6 months, and we'll then evaluate the data and decide which ones to keep in perpetuity."

Unfortunately, according to our Product Manager, we're still figuring out how best to analyze this new data in Looker, and these metrics expire in 2 weeks. This PR extends their expiration by another 9 months so we can keep evaluating which subset should get an expiration of "never" and which should be removed.
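For reviewers who want a quick way to spot metrics that are about to lapse, here is a minimal sketch. It assumes the `expires` values have already been read out of `metrics.yaml` into a plain dict, and it only handles ISO dates and the literal `never`. The helper names and the earlier expiry date in the example are hypothetical; only `2025-04-15` comes from the data review in this PR.

```python
from datetime import date, timedelta
from typing import Dict, List, Optional


def parse_expiry(value: str) -> Optional[date]:
    """Return the expiry date, or None when the metric never expires."""
    return None if value == "never" else date.fromisoformat(value)


def expiring_soon(expirations: Dict[str, str], today: date, window_days: int) -> List[str]:
    """List metric names whose expiry falls on or before today + window_days."""
    cutoff = today + timedelta(days=window_days)
    soon = []
    for name, value in expirations.items():
        expiry = parse_expiry(value)
        if expiry is not None and expiry <= cutoff:
            soon.append(name)
    return sorted(soon)


# Hypothetical snapshot: metric names are from the data review table,
# but "2024-07-20" is a placeholder, not the real previous expiry date.
expirations = {
    "no_signal_count": "2025-04-15",      # renewed date from the data review
    "stable_time": "2024-07-20",          # hypothetical not-yet-renewed value
    "example_permanent_metric": "never",  # hypothetical metric name
}

print(expiring_soon(expirations, date(2024, 7, 8), 14))
```

With a two-week window from the merge date, only the entry still carrying the placeholder date is reported; the renewed and `never` entries are skipped.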

Reference

N/A

Checklist

  • My code follows the style guidelines for this project
  • I have not added any packages that contain high risk or unknown licenses (GPL, LGPL, MPL, etc. consult with DevOps if in question)
  • I have performed a self review of my own code
  • I have commented my code PARTICULARLY in hard to understand areas
  • I have added thorough tests where needed
@mcleinman
Collaborator Author

Request for data collection review form

(attention @travis79 )

All questions are mandatory. You must receive review from a data steward peer on your responses to these questions before shipping new data collection.

1. What questions will you answer with this data?

These are a handful of metrics to learn more about how often VPN sessions have connectivity issues. How many times in a VPN session do users have connectivity issues? How long do those issues last?

2. Why does Mozilla need to answer these questions? Are there benefits for users? Do we need this information to address product or business requirements?

We do not have great insight into some basic questions about our users: How often are they having issues? What sort of issues are they? How long do users keep the VPN active?

3. What alternative methods did you consider to answer these questions? Why were they not sufficient?

We carefully considered which metrics were needed. We also had some prior metrics (which are removed in this PR), but they were not robust enough to answer the questions above.

4. Can current instrumentation answer these questions?

No, unfortunately. See question 3 for more details.

5. List all proposed measurements and indicate the category of data collection for each measurement, using the [Firefox data collection categories](https://wiki.mozilla.org/Data_Collection) found on the Mozilla wiki.

Note that the data steward reviewing your request will characterize your data collection based on the highest (and most sensitive) category.

| Measurement Name | Measurement Description | Data Collection Category | Tracking Bug # |
| --- | --- | --- | --- |
| changed_to_no_signal | Event that is recorded when connection status changes to no signal | technical | VPN-5860 |
| changed_to_unstable | Event that is recorded when connection status changes to unstable | technical | VPN-5860 |
| changed_to_stable | Event that is recorded when connection status changes to stable | technical | VPN-5860 |
| changed_to_pending | Event that is recorded when connection status changes to pending | technical | VPN-6406 |
| no_signal_count | Count of health checks which result in status no signal | technical | VPN-5860 |
| unstable_count | Count of health checks which result in status unstable | technical | VPN-5860 |
| stable_count | Count of health checks which result in status stable | technical | VPN-5860 |
| pending_count | Count of health checks which result in status pending (bad internet) | technical | VPN-6406 |
| no_signal_time | Time distribution of how long connectivity status stays in no signal | technical | VPN-5860 |
| unstable_time | Time distribution of how long connectivity status stays in unstable | technical | VPN-5860 |
| stable_time | Time distribution of how long connectivity status stays in stable | technical | VPN-5860 |
| pending_time | Time distribution of how long connectivity status stays in pending (bad internet) | technical | VPN-6406 |

6. Please provide a link to the documentation for this data collection which describes the ultimate data set in a public, complete, and accurate way.

This collection is documented in the Glean Dictionary at https://dictionary.telemetry.mozilla.org/

7. How long will this data be collected? Choose one of the following:

I want this data to be collected for 6 months initially (potentially renewable).
(It's likely that after reviewing 6 months of data, a subset of these metrics will be collected in perpetuity, and some will be removed.)

8. What populations will you measure?

No filters: all channels, countries, and locales, unless the user has opted out of data collection on that device.

9. If this data collection is default on, what is the opt-out mechanism for users?

When launching the app for the first time, a user is given the option to allow or decline data collection. After this initial setup screen, a user can always toggle data collection permissions in the System Preferences screen.

10. Please provide a general description of how you will analyze this data.

Dashboards that product managers and others will consult on a regular basis.

11. Where do you intend to share the results of your analysis?

Within the Mozilla VPN team.

12. Is there a third-party tool (i.e. not Glean or Telemetry) that you are proposing to use for this data collection?

No

Member

@travis79 travis79 left a comment

Data Review

1. Is there or will there be documentation that describes the schema for the ultimate data set in a public, complete, and accurate way?

Yes, through the metrics.yaml file and the Glean Dictionary.

2. Is there a control mechanism that allows the user to turn the data collection on and off?

Yes, through the data preferences in the application settings.

3. If the request is for permanent data collection, is there someone who will monitor the data over time?

N/A, collection to end or be renewed by 2025-04-15

4. Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 1, Technical data

5. Is the data collection request for default-on or default-off?

Default-on

6. Does the instrumentation include the addition of any new identifiers (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)?

No

7. Is the data collection covered by the existing Firefox privacy notice?

Yes

8. Does the data collection use a third-party collection tool?

No

Result

data-review+

@mcleinman mcleinman merged commit e008aa5 into main Jul 8, 2024
113 checks passed
@mcleinman mcleinman deleted the extend_metrics branch July 8, 2024 17:43