65

This is the announcement post of the Stack Exchange Data Explorer. Its answers were meant to iron out the initial glitches 6 to 8 years ago.

If you have found a bug or need support today with any of SEDEs features, please ask a new question and tag it with

What does it do?

Allows you to run arbitrary SQL queries on the data dumps, and share those queries with your friends.

Technical details

It is hosted on the Stack Exchange infrastructure , the database is hosted on the same SQL Server version all other sites are hosted on. You can use most of the TSQL commands, even create temporary objects (but not User Defined Functions).

The controls used are jQuery based, the grid is flexigrid SlickGrid.

All query results are cached, so sharing a link to a query will not result in the server going in to a tailspin.

The query edit control is a slightly modified CodeMirror control, I also am considering bespin cause the inter iframe communication is nasty complicated. It gives up nice syntax highlighting for queries.

Parameters

There is support for parameterized queries, to get it to work name your params like ##this##. For example have a look here.

Query name and description

A query can have a name and description, to name it lead with a comment, to describe it continue commenting, for example:

-- This is my query name 
-- This is my description
-- I can span multiple lines

SELECT 1

Anonymous users can initially name a query, but can not change the names of any existing queries.

Tutorial

There's a not so well-known tutorial which helps you getting started in writing your own SQL queries.

Logging in

You can use your openid provider to log in, at the moment once logged in we will track the list of queries you execute (giving you basic history support)

Also if there are any params named UserId they will be populated automatically if your EmailHash matches the one on the site being queried.

Magic columns

If you alias an id column with as [Post Link] it will automatically create a link in the result set to the parent site. Similarly if you alias an id column with as [User Link] it will display a link to the user page. select top 10 * from Posts will show you how tags are done. (Magic columns for images is planned) example (most controversial posts: https://data.stackexchange.com/stackoverflow/q/1 )

So what are you waiting for, check it out at https://data.stackexchange.com

Any feedback would be greatly appreciated. I hope to have the login section working fairly soon and allow voting on the best queries and commenting.

Featured Queries

At the moment, admins have permission to feature interesting queries on this page https://data.stackexchange.com/stackoverflow/queries, I am looking for better ways to manage the huge query list that is building up.

5
  • Where is currently the documentation? Ī̲’d add one thing to it (namely, that a string literal containing non-ASCII characters must be written as N' … '). Commented Aug 26, 2015 at 13:41
  • @IncnisMrsi That's standard T-SQL syntax. Searching for T-SQL things will bring up lots of useful information.
    – wizzwizz4
    Commented Sep 11, 2017 at 15:41
  • I am closing this question because answers on a nine-year-old post are not the best way to request changes. If you have a feature, support request, etc for SEDE, ask a new question with the data-explorer tag. Thanks. Commented Aug 18, 2019 at 3:27
  • @MonicaCellio Just because people are answering this question with bugs doesn't mean that this question is off-topic for this site. A lock would have been better. Commented Aug 18, 2019 at 3:39
  • @SonictheAnonymousWizHog it is unfortunate that not even mods have a free-form close reason available except under "off-topic". The question isn't off-topic, but there's no other way to close it. A lock would prevent edits, so if, for example, one of the links were to go bad in the future, the community couldn't fix it. I don't think waffles will mind, but if I'm wrong he can let us know. Commented Aug 18, 2019 at 3:43

8 Answers 8

16
+100

Once you have got the login section working, how about having the ability to either work out or store your user ids for the sites with your login. Queries could then be written and stored similar to the parameterised queries, but with a special token that would automatically get replaced with your user id.

eg. Instead of (as I have seen in several stored queries):

DECLARE @UserId int = 1 --<-- Change this by your user ID.

you would write:

DECLARE @UserId int = ##USER_ID##

This would automatically substitue the current user id into the query upon execution.

5
  • 1
    totally, I already set up the indexes to do that... login is working now btw
    – waffles
    Commented May 20, 2010 at 22:50
  • @waffles: so it is! I've just logged in, but then got an error page when I clicked on my username. Commented May 20, 2010 at 23:22
  • OK I sorted out a fair bit of bugginess there, can you try again? at least I should be able to see a log of the failure.
    – waffles
    Commented May 25, 2010 at 2:11
  • Note ... User ID is filled in automatically
    – waffles
    Commented May 25, 2010 at 2:12
  • @waffles: it is not giving me the error now. Commented May 25, 2010 at 23:24
7

heart!
(source: citypages.com)

I was trying to find an image in google, that I knew I had attached to a meta post a while ago. I utterly failed to find it using a keyword search, but I was able to find it on the cloudapp sandbox by searching with the query select Id, ParentId, Body from Posts where OwnerUserId = 132636 and Body like '%img%';

(For the curious, the post was here -- now deleted, so searching on meta itself also failed to find it.)

1
  • 1
    This post still needs one more vote to undelete :-D
    – Pollyanna
    Commented May 25, 2010 at 14:25
5

It choked on a simple select * from votes query.

Here's an old version of the procedure I used to run queries on StackQL:

CREATE PROCEDURE WebQuery 
@QueryText nvarchar(1000),
@IPHash binary(16)
AS
BEGIN

   --Log query
   INSERT INTO Logs (QueryText, IPHash, QueryHash) VALUES (@QueryText, @IPHash, HashBytes('md5', LTrim(RTrim(Upper(@QueryText))))

    -- no writing allowed, so no need to lock on select
   SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED 

   -- throttles
   SET ROWCOUNT 500 
   SET QUERY_GOVERNOR_COST_LIMIT 500

   exec (@QueryText)

END

Note the code for throttling (rowcount and cost_limit). I was never happy with this:

  • the cost limit control never really worked right - I either had to turn it too high to be at all effective or it was so low normal queries would keep bumping into it.
  • it's pretty easy to get past the throttling by including your own set commands in the query, but at least then I still have a log so I could check and see who is breaking my controls and why.
  • I played with the row limits, and anything > ~1500 performance problems

The query hash is a naive attempt to make it easier to spot identical queries, so I could do things like group by the hash and show counts. New version is instead making more of an emphasis on re-using old queries, so that you can plug in parameters without breaking the hash and I can update a count in the log directly.

6
  • I still have nightmares about SET ROWCOUNT it has some seriously nasty side affects. My throttling is done in the data reader (which works if you cancel the command) the main problem was that I forgot to do a ++rowCount somewhere :) I also already have query hash in place and I cache the results. The grid I am using seems to choke with 500 results, so I will look at a virtual one.
    – waffles
    Commented May 10, 2010 at 22:57
  • 2
    just fixed this issue
    – waffles
    Commented May 11, 2010 at 6:08
  • Note: slick grid is really happy with 2000 rows. I reckon I could even push it up a tad.
    – waffles
    Commented May 17, 2010 at 10:51
  • @waffles - the problem was on the database end, not the client, and also depended on how complex the query was Commented May 17, 2010 at 13:08
  • Yerp, I like your QUERY_GOVERNOR_COST_LIMIT trick, I may set it for the readonly user using db settings ... for the time being it seems that azure is coping fine with this, also there is a really cool trick I use, if you cancel your command after getting as many results as you wand from the reader, you can quickly close it.
    – waffles
    Commented May 20, 2010 at 2:42
  • @waffles the query governor is... touchy. It's weird what it will restrict one moment and allow the next, and it's difficult to tune correctly. Commented May 20, 2010 at 16:29
4

The CSV export seems to broken, its giving out part CSV, part something(JSON!?) output.

Sample output:

User Link,Reputation,Days,RepPerDays
{
  "title": "John T",
  "id": 1931
},56421,288,195
{
  "title": "Wil",
  "id": 4386
},35724,270,132
{
  "title": "unknown",
  "id": 8672
},23661,246,96
{
  "title": "nik",
  "id": 263
},20317,289,70
{
  "title": "quack quixote",
  "id": 12786
},13409,212,63

Link to query: http://odata.stackexchange.com/superuser/q/946

8
  • Will sort this out ASAP . sorry about it
    – waffles
    Commented May 24, 2010 at 23:45
  • OK I have logging in place, can you retry so I can capture the error
    – waffles
    Commented May 25, 2010 at 2:10
  • @waffles Just logged in from my home laptop, got the same error Commented May 25, 2010 at 3:00
  • Thanks I can see it ... sorting it out ...
    – waffles
    Commented May 25, 2010 at 3:09
  • ok your email was null which caused issues, I sorted it out in the DB and just committed a fix, you should be able to access stuff now
    – waffles
    Commented May 25, 2010 at 3:13
  • @waff yeps I can login now, thanks a lot! I can't edit my profile ? Commented May 25, 2010 at 3:16
  • yeah that is a work in progress :) should hopefully be done this week
    – waffles
    Commented May 25, 2010 at 3:22
  • @waff Ok - I guess I can be jon.doe2 for a week or two :)) Commented May 25, 2010 at 3:24
2

One word: STEXDEX.

2
  • this leaves me really confused :)
    – waffles
    Commented May 29, 2010 at 10:47
  • @waffles: I'm just saying that I've been referring to Stack Exchange Data Explorer as SEDE, and I think it's better to officially abbreviate it as stexdex. Commented May 29, 2010 at 12:39
2

Very nice one.

Here is the one I have tested for users pages, limit 35 on each result:

DECLARE @start INT
DECLARE @limit INT
DECLARE @page INT

SET @page = 1
SET @start = 1+35*(@page-1)
SET @limit = 35

SELECT * FROM
(
    SELECT row_number() OVER (ORDER BY Reputation DESC) AS i, *
    FROM users
) AS T
WHERE T.i BETWEEN (@start) AND (@start + @limit) 

And I've tried to log in with my google account but I am getting the following error:

Sorry, an error occurred while processing your request.

3
  • yerp, openid is not fully implemented, hope to finish it today
    – waffles
    Commented May 10, 2010 at 22:58
  • its implemented now ... but still need to write the use page
    – waffles
    Commented May 15, 2010 at 4:09
  • Thanks @waffles, I am going to try again.
    – YOU
    Commented May 15, 2010 at 4:15
1

The current result display is fairly clean and generally OK, but it doesn't play well with Chrome's web page search. It looks like it hides the text that has scrolled off the screen -- good for some optimisations, but it means you can't search within results.

2
  • that would be cause the results are not even in the DOM. the DOM basically dies a horrible death if you give it a 2000 row table
    – waffles
    Commented May 29, 2010 at 6:26
  • It can find currently displayed entries, sometimes, especially when the focus was in the result table when you type Ctrl+F.
    – Mark Hurd
    Commented May 29, 2010 at 7:19
1

Not a problem that I can see, but IE8 is reporting this at the moment:

Webpage error details

User Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; SLCC1; .NET CLR 2.0.50727; MDDC; .NET CLR 3.5.30729; .NET CLR 3.0.30729) Timestamp: Sun, 30 May 2010 11:09:10 UTC

Message: Invalid argument. Line: 210 Char: 11 Code: 0 URI: http://odata.stackexchange.com/Scripts/codemirror/select.js

NOTE It occurred while I was editing and executing a query, but rerunning the query for meta and then for SO again and I can't repro it :-(

1
  • yerp, code mirror can be a bit flakey, I am considering dropping it in favor of bespin
    – waffles
    Commented May 31, 2010 at 4:51

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .