86

Let's say I have three resources that are related like so:

Grandparent (collection) -> Parent (collection) -> Child (collection)

The above depicts the relationship among these resources like so: Each grandparent can map to one or several parents. Each parent can map to one or several children. I want the ability to support searching against the child resource but with the filter criteria:

If my clients pass me an id reference to a grandparent, I want to only search against children who are direct descendants of that grandparent.

If my clients pass me an id reference to a parent, I want to only search against children who are direct descendants of my parent.

I have thought of something like so:

GET /myservice/api/v1/grandparents/{grandparentID}/parents/children?search={text}

and

GET /myservice/api/v1/parents/{parentID}/children?search={text}

for the above requirements, respectively.

But I could also do something like this:

GET /myservice/api/v1/children?search={text}&grandparentID={id}&parentID={id}

In this design, I could allow my client to pass me one or the other in the query string: either grandparentID or parentID, but not both.
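As a rough illustration of the "one or the other, but not both" rule for the query-string design, here is a minimal Python sketch. The function name `parse_child_search` and the returned dictionary shape are made up for this example, and it assumes that passing neither ID is also rejected, which the question does not state.

```python
from urllib.parse import urlsplit, parse_qs


def parse_child_search(url: str) -> dict:
    """Extract the search text and exactly one ancestor filter from the query string."""
    query = parse_qs(urlsplit(url).query)
    has_grandparent = "grandparentID" in query
    has_parent = "parentID" in query
    # Assumption: exactly one of the two IDs must be present (not both, not neither).
    if has_grandparent == has_parent:
        raise ValueError("pass exactly one of grandparentID or parentID")
    key = "grandparentID" if has_grandparent else "parentID"
    return {
        "search": query.get("search", [""])[0],  # empty string if no search text
        "filter": key,
        "id": query[key][0],
    }
```

The server-side handler would then branch on `filter` to decide which ancestor lookup to run.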

My questions are:

1) Which API design is more RESTful, and why? Semantically, they mean and behave the same way. The last resource in the URI is "children", effectively implying that the client is operating on the children resource.

2) What are the pros and cons to each in terms of understandability from a client's perspective, and maintainability from the designer's perspective.

3) What are query strings really used for, besides "filtering" on your resource? If you go with the first approach, the filter parameter is embedded in the URI itself as a path parameter instead of a query string parameter.

Thanks!

4
  • 3
    The title of your question should be extremely confusing to anyone viewing this. The valid segments of a URI are defined as <scheme>://<user>:<password>@<host>:<port>/<path>;<params>?<query>/#<fragment> (although <password> is deprecated) A "query string" is a valid component of a URI so your "vs" in the title is crazy talk. Commented Jun 3, 2015 at 14:26
  • Do you mean I want to only search against children who are INdirect descendants of that grandparent. ? According to your structure, Grandparent has no direct children.
    – null
    Commented Jun 3, 2015 at 15:25
  • What is the difference between a child and a parent? Is a parent a parent if he doesn't have children? Smells of a design fault
    – Pinoniq
    Commented Jun 4, 2015 at 11:10
  • re: potential design flaw and if you have information about a person but no information on their parents, do they qualify as a child? (e.g., Adam and Eve ) :) Commented Jul 10, 2018 at 21:17

7 Answers

74

First

As per RFC 3986 §3.4 (Uniform Resource Identifier (URI): Generic Syntax, Syntax Components, Query):

3.4 Query

The query component contains non-hierarchical data that, along with data in the path component (Section 3.3), serves to identify a resource within the scope of the URI's scheme and naming authority (if any).

Query components are for retrieval of non-hierarchical data; there are few things more hierarchical in nature than a family tree! Ergo - regardless of whether you think it is "REST-y" or not- in order to conform to the formats, protocols, and frameworks of and for developing systems on the internet, you must not use the query string to identify this information.

REST has nothing to do with this definition.

Before addressing your specific questions, your query parameter of "search" is poorly named. Better would be to treat your query segment as a dictionary of key-value pairs.

Your query string could be more appropriately defined as

?first_name={firstName}&last_name={lastName}&birth_date={birthDate} etc.
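To make the "dictionary of key-value pairs" idea concrete, here is a small Python sketch that splits a URI into its hierarchical path segments and its query dictionary using the standard library. The host `example.com` and the sample values are placeholders, and `split_uri` is a name invented for this example.

```python
from urllib.parse import urlsplit, parse_qsl


def split_uri(uri: str):
    """Return (path segments, query dictionary, fragment) for a URI."""
    parts = urlsplit(uri)
    # Hierarchical data lives in the path, one segment per level.
    path_segments = [segment for segment in parts.path.split("/") if segment]
    # Non-hierarchical data lives in the query as key-value pairs.
    query_pairs = dict(parse_qsl(parts.query))
    return path_segments, query_pairs, parts.fragment


segments, query, fragment = split_uri(
    "https://example.com/myservice/api/v1/parents/42/children"
    "?first_name=Ann&last_name=Lee#top"
)
```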

To answer your specific questions

  1. Which API design is more RESTful, and why? Semantically, they mean and behave the same way. The last resource in the URI is "children", effectively implying that the client is operating on the children resource.

I don't think this is as clear cut as you seem to believe.

None of these resource interfaces are RESTful. The major precondition for the RESTful architectural style is that Application State transitions must be communicated from the server as hypermedia. People have labored over the structure of URIs to make them somehow "RESTful URIs" but the formal literature regarding REST actually has very little to say about this. My personal opinion is that much of the meta-misinformation about REST was published with the intent of breaking old, bad habits. (Building a truly "RESTful" system is actually quite a bit of work. The industry glommed on to "REST" and back-filled some orthogonal concerns with nonsensical qualifications and restrictions. )

What the REST literature does say is that if you are going to use HTTP as your application protocol, you must adhere to the formal requirements of the protocol's specifications and you cannot "make http up as you go and still declare that you are using http"; if you are going to use URIs for identifying your resources, you must adhere to the formal requirements of the specifications regarding URI/URLs.

Your question is addressed directly by RFC3986 §3.4, which I have linked above. The bottom line on this matter is that even though a conforming URI is insufficient to consider an API "RESTful", if you want your system to actually be "RESTful" and you are using HTTP and URIs, then you cannot identify hierarchical data through the query string because:

3.4 Query

The query component contains non-hierarchical data

...it's as simple as that.

  2. What are the pros and cons to each in terms of understandability from a client's perspective, and maintainability from the designer's perspective.

The "pros" of the first two is that they are on the right path. The "cons" of the third one is that it appears to be flat out wrong.

As far as your understandability and maintainability concerns, those are definitely subjective and depend on the comprehension level of the client developer and the design chops of the designer. The URI specification is the definitive answer as to how URIs are supposed to be formatted. Hierarchical data is supposed to be represented on the path and with path parameters. Non-hierarchical data is supposed to be represented in the query. The fragment is more complicated, because its semantics depend specifically upon the media type of the representation being requested. So to address the "understandability" component of your question, I will attempt to translate exactly what your first two URIs are actually saying. Then, I will attempt to represent what you say you are trying to accomplish with valid URIs.

Translation of your verbatim URIs to their semantic meaning

/myservice/api/v1/grandparents/{grandparentID}/parents/children?search={text}

This says: for the parents of grandparents, find their child having search={text}. What you said with your URI is only coherent if searching for a grandparent's siblings. With your "grandparents, parents, children" you found a "grandparent", went up a generation to their parents, and then came back down to the "grandparent" generation by looking at the parents' children.

/myservice/api/v1/parents/{parentID}/children?search={text}

This says: for the parent identified by {parentID}, find their child having ?search={text}. This is closer to what you are wanting, and represents a parent->child relationship that can likely be used to model your entire API. To model it this way, the burden is placed upon the client to recognize that if they have a "grandparentId", there is a layer of indirection between the ID they have and the portion of the family graph they are wishing to see. To find a "child" by "grandparentId", you can call your /parents/{parentID}/children service and then, for each child that is returned, search their children for your person identifier.
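The two-step lookup described above might look roughly like this on the client side. This is only a sketch: the `CHILDREN_OF` dictionary is a hypothetical in-memory stand-in for calls to /parents/{parentID}/children, and the IDs are invented.

```python
# Hypothetical stand-in for the data behind GET /parents/{parentID}/children.
CHILDREN_OF = {
    "gp1": ["p1", "p2"],
    "p1": ["c1", "c2"],
    "p2": ["c3"],
}


def get_children(person_id: str) -> list:
    """Pretend service call: the direct children of one person."""
    return CHILDREN_OF.get(person_id, [])


def find_grandchild(grandparent_id: str, target_id: str):
    """Walk one level down, then search each parent's children for the target."""
    for parent_id in get_children(grandparent_id):
        if target_id in get_children(parent_id):
            return parent_id, target_id
    return None  # the target is not a grandchild of this grandparent
```

The cost of this approach is one request per intermediate parent, which is the burden on the client mentioned above.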

Implementation of your requirements as URIs

If you want to model a more extensible resource identifier that can walk the tree, I can think of several ways you can accomplish that.

1) The first one, I've already alluded to. Represent the graph of "People" as a composite structure. Each person has a reference to the generation above it through its Parents path and to a generation below it through its Children path.

/Persons/Joe/Parents/Mother/Parents would be a way to grab Joe's maternal grandparents.

/Persons/Joe/Parents/Parents would be a way to grab all of Joe's grandparents.

/Persons/Joe/Parents/Parents?id={Joe.GrandparentID} would grab Joe's grandparent having the identifier you have in hand.

These would all make sense (note that there could be a performance penalty here, depending on the task, because the lack of branch identification in the "Parents/Parents/Parents" pattern forces a depth-first search on the server). You also benefit from having the ability to support any arbitrary number of generations. If, for some reason, you desire to look up 8 generations, you could represent this as

/Persons/Joe/Parents/Parents/Parents/Parents/Parents/Parents/Parents/Parents?id={Joe.NotableAncestor}

but this leads into the second dominant option for representing this data: through a path parameter.


2) Use path parameters to "query the hierarchy".

You could develop the following structure to help ease the burden on consumers and still have an API that makes sense.

To look back 147 generations, representing this resource identifier with path parameters allows you to do

/Persons/Joe/Parents;generations=147?id={Joe.NotableAncestor}

To locate Joe from his Great Grandparent, you could look down the graph a known number of generations for Joe's Id.

/Persons/JoesGreatGrandparent/Children;generations=3?id={Joe.Id}
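For what it's worth, path parameters of the `;generations=3` form are not hard to pull apart on the server. Below is a minimal Python sketch, not a definitive implementation: `parse_segments` is a name invented here, and it assumes each parameter is written as `;key=value` after the segment name.

```python
from urllib.parse import urlsplit, parse_qsl


def parse_segments(uri: str):
    """Split a URI path into (name, params) pairs plus the query dictionary.

    Path parameters are read from ';key=value' suffixes on each segment.
    """
    parts = urlsplit(uri)
    segments = []
    for raw in parts.path.split("/"):
        if not raw:
            continue  # skip the empty piece before the leading slash
        name, *param_strings = raw.split(";")
        params = dict(p.split("=", 1) for p in param_strings if "=" in p)
        segments.append((name, params))
    return segments, dict(parse_qsl(parts.query))


segments, query = parse_segments(
    "/Persons/JoesGreatGrandparent/Children;generations=3?id=joe-1"
)
```

Here `joe-1` is a placeholder for Joe's identifier; the hierarchy depth ends up on the `Children` segment while the non-hierarchical identifier stays in the query.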

The major thing of note with these approaches is that without further information in the identifier and request, you should expect that the first URI is retrieving a Person 147 generations up from Joe with the identifier of Joe.NotableAncestor. You should expect the second one to retrieve Joe. Assume that what you actually want is for your calling client to be able to retrieve the entire set of nodes and their relationships between the root Person and the final context of your URI. You could do that with the same URI (with some additional decoration) and setting an Accept of text/vnd.graphviz on your request, which is the IANA registered media type for the .dot graph representation. With that, change the URI to

/Persons/Joe/Parents;generations=147?id={Joe.NotableAncestor.Id}#directed

with an HTTP Request Header Accept: text/vnd.graphviz and you can have clients fairly clearly communicate that they want the directed graph of the generational hierarchy between Joe and 147 generations prior where that 147th ancestral generation contains a person identified as Joe's "Notable Ancestor."

I'm unsure whether text/vnd.graphviz has any pre-defined semantics for its fragment; I could find none when I searched. If that media type actually does have pre-defined fragment information, then its semantics should be followed to create a conforming URI. But, if those semantics are not pre-defined, the URI specification states that the semantics of the fragment identifier are unconstrained and instead defined by the server, making this usage valid.


  3. What are query strings really used for, besides "filtering" on your resource? If you go with the first approach, the filter parameter is embedded in the URI itself as a path parameter instead of a query string parameter.

I believe I have already thoroughly beaten this to death, but query strings are not for "filtering" resources. They are for identifying your resource from non-hierarchical data. If you have drilled down your hierarchy with your path by going /person/{id}/children/ and you are wishing to identify a specific child or a specific set of children, you would use some attribute that applies to the set you are identifying and include it inside the query.

15
  • 1
    The RFC is only concerned with hierarchy insofar as it defines a syntax and algorithm for resolving relative URI references. Could you elaborate or cite some sources explaining why the examples in the original post are not conforming? Commented Jun 30, 2015 at 13:58
  • 2
    Isn't a family tree really a graph not a tree, and not at all hierarchical. considering multiple parents, divorce and re-marriage etc.
    – Myster
    Commented Feb 18, 2016 at 1:49
  • 1
    @RobertoAloi It seems counterintuitive to me to communicate your own "No Items Found" interface through an empty set when HTTP already has a definition for that. The basic principle is that you are asking the server to return "thing(s)" and if there are no "thing(s)" to return, the server communicates that with "404 - Not Found" What's counterintuitive about that? Commented Jan 30, 2017 at 17:24
  • 1
    I always believed 404 to indicate that the root URL of the resource was not found, i.e. the collection as a whole. So if you queried /books?author=Jim and there were no books by Jim, you'd receive an empty set []. But if you queried /articles?author=Jim and the articles resource didn't even exist as a collection, 404 would help to indicate that there's no use in looking for any articles at all.
    – ADJenks
    Commented Oct 24, 2018 at 9:29
  • 1
    @adjenks You didn't learn that from formal specifications. Structurally, the constituents of a url can be thought of as containing a "root" if that helps you reason about the purposes for the constituent parts, but ultimately the query string is not a display filter against a resource identified via the path. The query string is a first class citizen of a url. When you find no resources on the server which match your url (including the query string) 404 is the means defined within http to communicate this. The interaction model you pose introduces a distinction without a difference. Commented Oct 24, 2018 at 13:33
17

This is where you get it wrong:

If my clients pass me an id reference

In a REST system, clients should never be bothered with IDs. The only resource identifiers that the client should know about are URIs. This is the principle of "uniform interface".

Think about how clients would interact with your system. Say the user starts browsing through a list of grandparents and picks one, which brings him to /grandparent/123. If the client should be able to search the children of /grandparent/123, then according to "HATEOAS", whatever is returned when you query /grandparent/123 should include a URL to the search interface. This URL should already have embedded in it whatever data is needed to filter by the current grandparent.

Whether the link looks like /grandparent/123?search={term} or /parent?grandparent=123&search={term} or /parent?grandparentTerm=someterm&someothergplocator=blah&search={term} is inconsequential according to REST. Notice how all of those URLs have the same single parameter, {term}, even though they use different criteria. You can switch between any of those URLs, or you can mix them up depending on the specific grandparents, and the client wouldn't break, because the logical relationship between the resources is the same even though the underlying implementation might differ significantly.

If you had instead created the service such that it requires /grandparent/{grandparentID}?search={term} when you go one way but /children?parent={parentID}&search={term} when you go another way, that is too much coupling because the client would have to know to interpolate different things on different relations that are conceptually similar.

Whether you actually go with /grandparent/123?search={term} or /parent?grandparent=123&search={term} is a matter of taste and whichever implementation is easier for you right now. The important thing is to not require the client to be modified if you change your URL strategy or if you use different strategies on different parents-children relations.
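A minimal Python sketch of what this means for client code: the client only ever fills in {term} in a link template handed to it by the server. The `links`/`search` keys in the representation are an assumed shape for illustration, not something the original post specifies.

```python
from urllib.parse import quote


def expand_search_link(representation: dict, term: str) -> str:
    """Fill the single {term} placeholder in the server-provided search link."""
    template = representation["links"]["search"]  # assumed representation shape
    return template.replace("{term}", quote(term, safe=""))


# Two servers (or two versions of one server) using different URL strategies.
grandparent_v1 = {"links": {"search": "/grandparent/123?search={term}"}}
grandparent_v2 = {"links": {"search": "/parent?grandparent=123&search={term}"}}

url_v1 = expand_search_link(grandparent_v1, "ann lee")
url_v2 = expand_search_link(grandparent_v2, "ann lee")
```

The same client function works against both representations, which is the decoupling being argued for.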

10

I'm not sure why people think putting the ID values in the URL means it's somehow a REST API. REST is about handling verbs and passing resources.

So if you want to PUT a new user, you'd have to send a fair chunk of data and a POST http request is ideal, so although you might send the key (eg. user id), you'll send the user data (eg name, address) as POST data.

Now it is a common idiom to put the resource identifier in the URI, but this is more convention than any form of canonical "it's not REST if it's not in the URI". Remember that the original thesis of REST doesn't really mention HTTP at all; it's an idiom for handling data in a client-server system, not something that is an extension to HTTP (though, obviously, HTTP is our primary form of implementing REST).

For example, Fielding uses the request of an academic paper as an example. You want to retrieve the resource "Dr John's Paper on the drinking of beers", but you might also want the initial version, or the latest version, so the resource identifier might not be something that is easily referenced as a single ID that can be placed in the URI. REST allows for this and a stated intention behind it is:

REST relies instead on the author choosing a resource identifier that best fits the nature of the concept being identified

So there's nothing stopping you from using a static URI to retrieve your parents, passing in a search term in the query string to identify the exact user you're after. In this case, the 'resource' you're identifying is the set of grandparents (and so the URI contains 'grandparents' as part of the URI). REST includes the concept of 'control data' that is designed for determining which representation of your resource is to be retrieved - the example given in the thesis is cache control, but also version control - so my request for Dr John's excellent paper can be refined by passing the version as control data, not part of the URI.

I think an example of REST interface that is not usually mentioned is SMTP. When constructing a mail message, you send verbs (FROM, TO etc) with the resource data for each part of the mail message. This is RESTful even though it doesn't use the http verbs, it uses its own set.

So... whilst you do need to have some resource identification in your URI, it doesn't have to be your id reference. This can happily be sent as control data, in the query string or even in POST data. What you're really identifying in your REST API is that you're after a child, which you already have in your URI.

So to my mind, reading the definition of REST, you're requesting a child resource, passing in control data (in the form of querystring) to determine which one you want returned. As a result, you cannot make a URI request for a grandparent or parent. You want children returned so the term parent/grandparent or id should definitely not be in such a URI. Children should be.

1
  • Actually, URI as a concept is central to REST. What is not as important though is the URI syntax. A proper REST interface must identify resources with a common syntax. At the basest of REST principle, it is not necessarily bad to identify complex resource with a JSON object instead of RFC 3986 URI, though if you do so, you should use JSON RI for your entire API.
    – Lie Ryan
    Commented Jun 12, 2016 at 2:03
8

A lot of people have already talked about what REST means, etc. But none seem to address the real issue: your design.

Why is a grandparent different from a father? They both have children that can possibly have children that can ...

Eventually, they are all 'human'. You probably have that in your code as well. So use that:

GET /<api-endpoint>/humans/<ID>

will return some useful info about the human (name, and so on).

GET /<api-endpoint>/humans/<ID>/children

will obviously return an array of children. If no children exist, an empty array is probably the way to go.

To make it easy, you could for instance add a flag 'hasChildren'. or 'hasGrandChildren'.
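A rough Python sketch of those two endpoints, using an in-memory dictionary in place of a real data store. The `HUMANS` data, the field names and the function names are all made up for illustration.

```python
# Hypothetical data store: every person is just a "human" with a list of child IDs.
HUMANS = {
    1: {"name": "Ada", "children": [2]},
    2: {"name": "Ben", "children": []},
}


def get_human(human_id: int) -> dict:
    """Body for GET /humans/<ID>: basic info plus a hasChildren flag."""
    human = HUMANS[human_id]
    return {
        "id": human_id,
        "name": human["name"],
        "hasChildren": bool(human["children"]),
    }


def get_children(human_id: int) -> list:
    """Body for GET /humans/<ID>/children: an array, empty if there are none."""
    return [get_human(child_id) for child_id in HUMANS[human_id]["children"]]
```

Because children are humans too, the same two functions serve grandparents, parents and children alike.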

Think smarter not harder

1
  • underrated answer
    – Vitim.us
    Commented Jun 19, 2020 at 18:40
1

The following is more RESTful because every grandparentID gets its own URL. This way the resource gets identified in a unique way.

GET /myservice/api/v1/grandparents/{grandparentID}
GET /myservice/api/v1/grandparents/{grandparentID}/parents/children?search={text}

The query parameter search is a good way to execute a search from the context of that resource.

When a family is getting very large you can use start/limit as query options for example:

GET /myservice/api/v1/grandparents/{grandparentID}/children?start=1&limit=50
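A small Python sketch of how start/limit could be applied on the server. It assumes `start` is a 1-based position, which the example URL suggests but does not state, and `paginate` is a name invented here.

```python
def paginate(items: list, start: int = 1, limit: int = 50) -> list:
    """Return up to `limit` items beginning at the 1-based position `start`."""
    if start < 1 or limit < 1:
        raise ValueError("start and limit must both be at least 1")
    offset = start - 1  # assumption: start=1 means the first item
    return items[offset:offset + limit]
```

A client asking for the second page of 50 would then send start=51&limit=50.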

As a developer it is good to have different resources with a unique URL/URI. I think you should use query parameter only when they also could be left out.

Maybe this is a good read http://www.thoughtworks.com/insights/blog/rest-api-design-resource-modeling and otherwise the original PhD thesis of Roy T Fielding https://www.ics.uci.edu/~fielding/pubs/dissertation/fielding_dissertation.pdf that explains the concepts very well and complete.

1

I'll go with

/myservice/api/v1/people/{personID}/descendants/2?search={text}

In this case every prefix is a valid resource:

  • /myservice/api/v1/people: all people
  • /myservice/api/v1/people/{personID}: one person with complete information(including ancestors, siblings, descendants)
  • /myservice/api/v1/people/{personID}/descendants: one person's descendants
  • /myservice/api/v1/people/{personID}/descendants/2: one person's grandchildren
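Resolving the trailing depth segment could be as simple as the following Python sketch. The `children_of` mapping is a hypothetical stand-in for whatever storage holds the parent-to-child relationships.

```python
def descendants_at_depth(children_of: dict, person_id: str, depth: int) -> list:
    """Return the people exactly `depth` generations below `person_id`."""
    current = [person_id]
    for _ in range(depth):
        # Move one generation down: collect the children of everyone in `current`.
        current = [child for parent in current for child in children_of.get(parent, [])]
    return current


# Hypothetical family: "a" has two children, who have three children between them.
family = {"a": ["b", "c"], "b": ["d"], "c": ["e", "f"]}
grandchildren = descendants_at_depth(family, "a", 2)
```

The ?search={text} filter would then be applied to the returned list.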

May not be the best answer, but at least makes sense to me.

-1

many before me have already pointed out that URL format is inconsequential in the context of RESTful service...my two cents would be...an important aspect of REST as advocated in the original write up was the concept of 'Resources'...when you design your URL one aspect you may need to keep in mind is that a 'Resource' is not a row in a single table (though it could be) ..rather it is a representation..that is also consistent ...and can be used to make changes to the state of that representation (and effectively to the underlying medium of storage)..and a given resource may only be meaningful in your business context..for example;

In your example /myservice/api/v1/grandparents/{grandparentID}/parents/children?search={text}

Could be a more meaningful resource if you shorten this up /myservice/api/v1/siblingsOfGrandParents?search=..

in this case you can declare 'siblingsOfGrandParents' as resource..this simplifies the URL

As others pointed out there is a widespread misguided notion that you need to fit in every type of hierarchical relationship among domain objects in more explicit form in a URL..there is no hard and fast rule to have 1-to-1 mapping between domain objects and resources and represent all their relationships...the question should rather be I believe...is there a need to expose such relationship via URLs...specifically path segments...perhaps not.

2
  • when you spend your precious time downgrading an answer, it would be helpful if you can speak up and give your reasons. Commented Dec 1, 2017 at 14:51
  • I'm not completely sure why downvoted based on reading your answer. I agree with you. A few immediate things that jump out to me is that you provided a slightly tangent example by mentioning "siblings" when the OP was asking about hierarchical relationships. Perhaps also your writing style could be improved. You use lots of "...", we tend to not use "..." a lot in formal, professional, or instructional writing. There's also a bit of redundancy (e.g. you mention for example, then say in your example). Perhaps your answer could be more concise and more directly address the OPs question.
    – KennyCason
    Commented Feb 5, 2019 at 18:21
