15

I go through phases of being rather paranoid, so that's the reason for this question.

I understand for this site to work there must be a certain amount of tracking and storage. Otherwise there'd be no record of questions, answers, comments, votes, reputation, or badges.

But how does this work and where are the boundaries?

Some questions:

  1. If I conduct a search, is a history of that captured and stored? So, if I do a search today for questions about the PowerBook G4, is a record of that search stored to the database against my name?
  2. If I view another user's profile, is there a record of the fact that I did so?
  3. Is my IP address captured and stored against my name?
  4. Is my location captured and stored against my name?
  5. Is my device captured and stored against my name?
  6. Is my browser captured and stored against my name?
  7. Is a history kept of my profile changes?
  8. Is a record kept of all questions I have viewed?
  9. Is anything else kept not already listed above?

For any of the above, if records are kept, who can view them?

4
  • 1
    Related: stackexchange.com/legal/privacy-policy
    – nohillside
    Commented Feb 25, 2017 at 12:50
  • I'm pretty sure 7 is done and available to moderators. For 1-6 and 8, I think it could be obtained from the log files, but I expect that only SE devs would have access to that, and I don't know the retention period for that info. For 9, it is worth mentioning that Google Analytics is used. That doesn't necessarily mean SE tracks more of you, but Google certainly does. Did you consider using Tor?
    – rene
    Commented Feb 25, 2017 at 16:49
  • 2
    @rene if profile text changes are stored then it's only for employees. There's no moderator option for this. The only thing we can see is recent name changes.
    – ChrisF Mod
    Commented Feb 25, 2017 at 17:28
  • Your public user profile is indexed by Google and the Wayback Machine (web.archive.org).
    Commented Feb 25, 2017 at 18:17

3 Answers

15

A lot of this information is already available either on our privacy policy page or scattered through various questions around Meta, but it's good to have it all in one place. I'm just going to answer your direct questions sequentially here, based on what is actually stored in the database and who can access it (see note about server logs at bottom).

  1. I'm not familiar enough with the search system to give you more detailed information about what is stored from search. There is some sort of data about them that is recorded. It wouldn't be accessible to anyone but staff, though.
  2. When you view a question or a user's profile, your view is recorded in the database for ~15 minutes before it gets deleted. During this time, visiting the page again won't cause the view counter to increase. We do not keep a permanent history of the pages you've viewed in the database (although some users have requested we do so).
  3. All the time, in a lot of places. Pretty much any time you do something that changes anything on the site (posting, commenting, editing, etc) your IP is stored with the action. Moderators can see a list of the IP addresses which have been used for your account, but not on which actions they were used.
  4. Only if you type it into your profile. But keep in mind that IP addresses can be geolocated to general areas, so that's basically like storing a location in disguise.
  5. That doesn't sound like anything we'd store, or have any reason to store.
  6. Only if you post it yourself (usually in bug reports on Meta) or if you use the contact page, in which case it is inserted into the support ticket automatically and is visible only to staff.
  7. Yes, kind of. Moderators can view a history of dates when you changed your about me text, but only staff can fetch the previous text itself. Moderators can also see when you changed your profile picture and what your previous profile pictures were, as well as when you changed your display name and what your previous display names were. Other profile fields are not stored in this history.
  8. See question 2's response.
  9. Pretty broad. I'm sure there are other things we store, but I don't have a list of all the things off the top of my head. As an example, staff can view a history of when you logged into your profile on any site, as well as when you added or removed any of your credentials. It comes to mind because I personally look at this information on a daily basis to help users with login problems and account merges. There's also your personalized prediction data which you can download to see the data for yourself or even disable its collection.
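
The short-lived view record described in point 2 can be sketched as a counter that remembers each (viewer, page) pair for a fixed window and ignores repeat views inside it. This is only an illustration of the idea, not the site's actual implementation: the `ViewCounter` class, its in-memory dictionaries, and the exact 15-minute window are assumptions made for the example.

```python
import time


class ViewCounter:
    """Counts page views, ignoring repeat views by the same viewer within a window.

    Hypothetical sketch: the real system stores this in a database, not in memory.
    """

    def __init__(self, window_seconds=15 * 60, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.clock = clock
        self.counts = {}   # page_id -> number of counted views
        self.recent = {}   # (viewer_id, page_id) -> time at which the record expires

    def record_view(self, viewer_id, page_id):
        """Return True if the view was counted, False if it was deduplicated."""
        now = self.clock()
        self._purge(now)
        key = (viewer_id, page_id)
        if key in self.recent:
            return False
        self.recent[key] = now + self.window_seconds
        self.counts[page_id] = self.counts.get(page_id, 0) + 1
        return True

    def _purge(self, now):
        """Delete expired view records so that no long-term history is kept."""
        expired = [key for key, expiry in self.recent.items() if expiry <= now]
        for key in expired:
            del self.recent[key]
```

Because expired records are deleted, the counter keeps the total per page but no lasting list of who viewed what, which matches the behaviour described above.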

It should also be noted, of course, that we do have server access logs that record every single time you hit a page on our site, which covers a lot of the information above. The Fanatic badge and its siblings are even awarded based off of some of those access logs (which is why we can't "fix" a day you missed manually - it would involve manually editing a server log). So while some information never makes its way into database storage, it is still available in those giant logs of all traffic to our sites (and this is true for pretty much any website you visit on the Internet). The server logs are also where we'd pull some anonymous statistical information, like overall browser usage.
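
To illustrate how a badge like Fanatic could be derived from access logs, here is a minimal sketch that takes the calendar days on which a user appears in the logs and finds the longest run of consecutive days. The function name and the input shape are assumptions for the example; how the real badge job reads the logs is not described here.

```python
from datetime import date, timedelta


def longest_visit_streak(visit_days):
    """Return the length of the longest run of consecutive calendar days.

    visit_days is an iterable of datetime.date values, one per logged visit;
    duplicates (several visits on one day) are allowed.
    """
    best = current = 0
    previous = None
    for day in sorted(set(visit_days)):
        if previous is not None and day - previous == timedelta(days=1):
            current += 1   # the streak continues
        else:
            current = 1    # first day, or a gap: start a new streak
        best = max(best, current)
        previous = day
    return best
```

A single missed day resets the run, which is why a gap cannot be patched without changing the underlying log data.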

3
  • In my comment on the question I mentioned Google Analytics that is used. Isn't that also part of number 9? And is that possibly combined with other meta data as mentioned in the answer from Shog9?
    – rene
    Commented Feb 25, 2017 at 21:36
  • @animuson We do have traffic logs that retain the "what you viewed" information. We don't use it for anything nor do we have a sane way to even surface it anywhere, but still. That is a fairly permanent internal record.
    – Adam Lear StaffMod
    Commented Feb 25, 2017 at 22:02
  • When you say the browser/device isn't logged, does that mean you are not logging user agents? Most access logs contain user agents which do identify the browser and device.
    Commented May 6, 2018 at 10:45

13

Animuson already answered your specific questions; I'd just like to point out a few things that tend to get overlooked when folks are thinking about this stuff...

First off, the obvious stuff we know:

  • If you're logged in and posting stuff on the site, that stuff is associated with your account. You don't really need me to tell you that; it's obvious if you look at your profile or anyone else's profile, but I want to state it first because a few of the other answers build on this basic information. You'll also find some stuff in your profile that's clearly based on information which isn't directly displayed: "last seen" for example.
  • If you look in your own profile, you'll also find some information listed there that isn't displayed in the other profiles you might view: the "votes" and "responses" tabs and the "Edit Profile & Settings" tab. Again, this is pretty basic stuff, but worth keeping in mind because it's information that you can verify is tracked without needing to trust my answer.

Now the slightly less obvious stuff that you should still have probably assumed we know:

  • Logs: some basic information on every request made to one of these sites is logged. This is pretty much standard behavior for web sites - it'd be pretty hard to operate a server if it didn't keep track of what it was doing, and we have lots of servers so we have lots of logs. The data here is used for everything from rate-limiting to testing to support - for example, if someone contacts us to report a bug in how a page is rendered, I can look up their access to that page and try to figure out what browser they were using (if they forgot to mention it, which is often the case), if anyone else with that same browser is hitting it, which server they were directed to, etc. There's a tremendous amount of data captured in these logs, so it isn't kept forever. There are also secondary logs that track the outcome of certain requests; for example, if someone tries to post and is blocked by a blacklisted term, I need to know about that if I want to fix false-positives.

  • Event data: this is similar to what is tracked in the logs, but at a bit higher level, and recorded for an entirely different purpose - analyzing how folks use the site. A lot of this could be done using only the logs, but it would be extremely slow and awkward due to the sheer volume of data; most of the time, we don't really care what an individual user is doing so much as we want to answer questions like, "how many people post questions after reading How To Ask?" - so Marc Gravell built a system for answering those questions, running A/B tests, etc. (he might be blogging about it in the near future!)

  • Synthesized data: most of what is logged falls into the category of "meta-data" - information about the request. Given enough meta-data, it is possible to construct useful profiles about the folks using our sites, even if they never actually fill out their profiles (see, I told you we'd get back to this). We're pretty circumspect in how much we try to do with this, and you can view what we have or opt-out at any time; for more details on this, check out Kevin Montrose's blog.
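
As a rough picture of what the "Logs" item above means, the sketch below parses one line in the widely used NCSA "combined" access log format and pulls out the IP address, requested path, and user agent. The format is an assumption: the actual log layout used on these sites is not given here, and the sample line, its IP address (from a documentation-only range), and its path are invented for illustration.

```python
import re

# Pattern for the NCSA "combined" log format (assumed, not the site's real format).
COMBINED_LOG = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_log_line(line):
    """Return a dict of the fields in one log line, or None if it does not match."""
    match = COMBINED_LOG.match(line)
    return match.groupdict() if match else None


# Made-up sample entry for demonstration only.
SAMPLE = (
    '203.0.113.7 - - [25/Feb/2017:16:49:00 +0000] '
    '"GET /questions/12345 HTTP/1.1" 200 5123 "-" '
    '"Mozilla/5.0 (Macintosh; PPC Mac OS X 10_5_8) Safari/533.19.4"'
)
entry = parse_log_line(SAMPLE)
```

Even this one line carries an IP address, a timestamp, the page requested, and a browser and platform hint, which is the kind of meta-data the last two items refer to.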

Now, with all that out of the way...

If you're sufficiently paranoid, you're asking the wrong questions.

We could collect a hell of a lot more information about you than we actually do; we could associate a lot more information with your account than we actually do. We could keep it for longer, disseminate it to more people, be more generally irresponsible about how we store it... Any time you visit a website, your browser is sending a fair bit of details along with your request, so either you trust the site to be responsible with them... Or you should take steps to reduce or alter what is sent.

This is our privacy policy - it states how we've committed to using any data we collect about you as a subscriber. You should read it carefully, and decide for yourself how much information you want to trust us with.

10

I'm "only" a moderator so I don't know the answers to all your questions, but here are the things that I do know.

  3. Is my IP address captured and stored against my name?

Yes.

  4. Is my location captured and stored against my name?

Only inasmuch as it can be ascertained from your IP address. So if your (or your employer's) ISP only returns the fact that you are in the UK, or USA, or wherever, then that's all that we can know. Obviously if you tell us where you live in your profile, then it will be tracked :)

  7. Is a history kept of my profile changes?

Some. Recent name changes are kept for a period. I don't know about anything else.

  8. Is a record kept of all questions I have viewed?

Not to my knowledge.

3
  • Thank you very much for the info. Side note to 1, 2, and 8: SO has various limits on hits per unit of time. These are mainly a precaution against spammers, bots, and DoS attacks. They may also be intended to make it harder to crawl the site content. To do that, at least a per-user or per-IP view count must be stored somewhere (for how long, or whether it is stored permanently, is unknown to us).
    – peterh
    Commented Feb 25, 2017 at 17:51
  • @peterh that is most likely done at the HA-proxy by IP-address
    – rene
    Commented Feb 25, 2017 at 17:58
  • 1
    @peterh - I think that is at the IP address level as we do get legitimate complaints from people who are blocked by them just because they share IP addresses with spammers.
    – ChrisF Mod
    Commented Feb 25, 2017 at 18:00
