Tuesday, September 2, 2008

HOW-TO use the JSON Query Servlet

Sling provides a standard JSON Query Servlet in the package org.apache.sling.servlets.get that allows you to perform a search on the contents of the underlying repository.

The source of the servlet is available here:
http://svn.apache.org/viewvc/incubator/sling/trunk/servlets/get/src/main/java/org/apache/sling/servlets/get/RedirectServlet.java?view=markup

This post discusses revision 660460. The source of that revision is reproduced at the very bottom of this post.

The servlet registers with the selector "query" and the extension "json" (lines 59 and 58 of the source, respectively).
That means we can trigger it by addressing any node in the repository and appending ".query" to its name and ".json" as the extension.
The node we address is irrelevant: the servlet never uses it to scope the search, so what gets searched depends only on the statement. The only thing we need is a resource that does exist, or the star resource.
So let's start off with a simple example that will work on any Sling installation, whatever has been done to it, and return each and every node as a search result:

http://localhost:8080/content.query.json?queryType=xpath&statement=//*

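This reveals the two main parameters for the servlet: queryType and statement.

  • queryType can be "xpath" or "sql". It is optional: any value other than "sql", including a missing one, is treated as "xpath" (lines 104-105).
  • statement is the search statement in the language specified through the queryType parameter. There is no default for it.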
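The way the servlet picks the query language (lines 104-105 of the source at the bottom) can be tried out in isolation. The following is a minimal sketch, not the servlet itself: the class name QueryTypeSketch and the method resolveQueryType are made up for illustration, and the two string constants stand in for javax.jcr.query.Query.SQL and Query.XPATH so that the sketch runs without the JCR API on the classpath.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-alone sketch of the queryType handling in lines 104-105. */
class QueryTypeSketch {
    // Stand-ins for javax.jcr.query.Query.SQL and Query.XPATH (assumption: used here
    // only so the sketch compiles without the JCR API).
    static final String SQL = "sql";
    static final String XPATH = "xpath";

    /** Returns "sql" only for the exact parameter value "sql"; everything else becomes "xpath". */
    static String resolveQueryType(String param) {
        return (param != null && param.equals(SQL)) ? SQL : XPATH;
    }

    public static void main(String[] args) {
        // The comparison is case-sensitive, so "SQL" falls back to xpath as well.
        List<String> samples = Arrays.asList("sql", "xpath", "SQL", "", null);
        for (String sample : samples) {
            System.out.println(sample + " -> " + resolveQueryType(sample));
        }
    }
}
```

So a request without queryType, or with a misspelled one, is silently run as an xpath query.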

The more intuitive language (at least to me and when dealing with structured content) is xpath, so let's see another example. This one will return all nodes of the primary type "nt:unstructured".
http://localhost:8080/content.query.json?queryType=xpath&statement=//*[@jcr:primaryType='nt:unstructured']

The same result will be returned with the following SQL query:
http://localhost:8080/content.query.json?queryType=sql&statement=select * from nt:base where jcr:primaryType='nt:unstructured'

This will (on my instance) give the following result. As you can see, for each result we get

  • the name of the node
  • the path
  • the score of the result entry
  • the primaryType
 [
    {
       "name":"org.apache.sling.launchpad.content",
       "jcr:path":"/var/sling/bundle-content/org.apache.sling.launchpad.content",
       "jcr:score":"1000",
       "jcr:primaryType":"nt:unstructured"
    },
    {
       "name":"mh.studies.sling.osgitest",
       "jcr:path":"/var/sling/bundle-content/mh.studies.sling.osgitest",
       "jcr:score":"1000",
       "jcr:primaryType":"nt:unstructured"
    },
    {
       "name":"test",
       "jcr:path":"/content/osgitest/test",
       "jcr:score":"1000",
       "jcr:primaryType":"nt:unstructured"
    }
 ]
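The statements in the URLs above contain spaces, quotes and brackets. A browser will usually encode these for you, but a programmatic client may have to do it itself. Here is a small sketch of building such a URL in Java; the class QueryUrlSketch and its method buildQueryUrl are hypothetical helpers, not part of Sling, and the sketch assumes Java 10 or later for URLEncoder.encode(String, Charset).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that builds a URL for the JSON Query Servlet. */
class QueryUrlSketch {

    /**
     * @param resourceUrl URL of any existing resource, e.g. http://localhost:8080/content
     * @param queryType   "xpath" or "sql"
     * @param statement   the raw, unencoded search statement
     */
    static String buildQueryUrl(String resourceUrl, String queryType, String statement) {
        return resourceUrl + ".query.json"
                + "?queryType=" + encode(queryType)
                + "&statement=" + encode(statement);
    }

    // URLEncoder produces form encoding: spaces become "+", reserved characters become %XX.
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String base = "http://localhost:8080/content";
        System.out.println(buildQueryUrl(base, "xpath",
                "//*[@jcr:primaryType='nt:unstructured']"));
        System.out.println(buildQueryUrl(base, "sql",
                "select * from nt:base where jcr:primaryType='nt:unstructured'"));
    }
}
```

The servlet reads both values through req.getParameter(), so it sees the decoded statement again.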

 

But you can also use the full power of the jcr-specific xpath functions, e.g.
http://localhost:8080/content.query.json?queryType=xpath&statement=//*[jcr:contains(.,'sling')]

This searches for all nodes that contain the word "sling".

The offset parameter skips the given number of rows at the start of the result (lines 110-116).

The rows parameter states the maximum number of rows that will be returned (lines 124-127 and 142). If it is omitted, all remaining rows are returned.
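How offset and rows interact is easiest to see on a plain iterator. The sketch below mirrors the two loops of the servlet (lines 110-116 and 142-172) but is not taken from it: the class PagingSketch, the method page and the sample paths are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-alone sketch of the offset/rows handling. */
class PagingSketch {

    /**
     * @param offset number of leading rows to skip
     * @param rows   maximum number of rows to return; -1 (the servlet's default) means no limit
     */
    static <T> List<T> page(Iterator<T> result, long offset, long rows) {
        // Skip rows, as in lines 110-116.
        long skip = offset;
        while (skip > 0 && result.hasNext()) {
            result.next();
            skip--;
        }
        // Collect rows until the iterator is exhausted or the counter reaches zero, as in line 142.
        List<T> page = new ArrayList<>();
        long count = rows;
        while (result.hasNext() && count != 0) {
            page.add(result.next());
            count--;
        }
        return page;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList("/a", "/b", "/c", "/d", "/e");
        System.out.println(page(paths.iterator(), 1, 2));  // prints [/b, /c]
        System.out.println(page(paths.iterator(), 3, -1)); // prints [/d, /e]
        System.out.println(page(paths.iterator(), 0, 0));  // prints []
    }
}
```

Note the last call: because the loop condition is count != 0, rows=0 yields an empty array rather than "no limit".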

The property parameter can be used to extract additional properties from each result node. Its value is a path relative to the result node (line 190).
It can be used multiple times.
http://localhost:8080/content.query.json?statement=//*[@jcr:primaryType='nt:file']&property=jcr:content/jcr:mimeType

will return:

 [
    {
       "name":"index.html",
       "jcr:path":"/index.html",
       "jcr:score":"1000",
       "jcr:primaryType":"nt:file",
       "jcr:content/jcr:mimeType":"text/html"
    },
    {
       "name":"sling-logo.png",
       "jcr:path":"/sling-logo.png",
       "jcr:score":"1000",
       "jcr:primaryType":"nt:file",
       "jcr:content/jcr:mimeType":"image/png"
    },
    {
       "name":"assert.js",
       "jcr:path":"/sling-test/sling/assert.js",
       "jcr:score":"1000",
       "jcr:primaryType":"nt:file",
       "jcr:content/jcr:mimeType":"application/x-javascript"
    },
    {
       "name":"sling-test.html",
       "jcr:path":"/sling-test/sling/sling-test.html",
       "jcr:score":"1000",
       "jcr:primaryType":"nt:file",
       "jcr:content/jcr:mimeType":"text/html"
    },
    {
       "name":"sling.css",
       "jcr:path":"/sling.css",
       "jcr:score":"1000",
       "jcr:primaryType":"nt:file",
       "jcr:content/jcr:mimeType":"text/css"
    },
    {
       "name":"GET.esp",
       "jcr:path":"/apps/samples/osgitest/GET.esp",
       "jcr:score":"1000",
       "jcr:primaryType":"nt:file",
       "jcr:content/jcr:mimeType":"text/plain"
    }
 ]

The excerptPath parameter specifies the argument to the rep:excerpt() function.
You might want to have a look here: http://wiki.apache.org/jackrabbit/ExcerptProvider

So what is the rep:excerpt() function? It returns an excerpt of the node that was found and highlights the search terms. The excerpt is an XML fragment, and it is included in the JSON output under the key "rep:excerpt()" in each object of the returned array.
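Because the excerpt arrives as markup inside a JSON string, a client usually has to pick it apart itself. Here is a rough sketch of doing that in Java with regular expressions. It assumes the fragment has already been JSON-decoded (so the escaped "<\/" is back to "</") and that matches are wrapped in <strong> tags, as in the excerpts shown in this post. The class ExcerptSketch, its methods and the sample string are made up for illustration, and a regex is only adequate for fragments this simple, not for markup in general.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for picking apart a JSON-decoded excerpt fragment. */
class ExcerptSketch {
    // Assumption: highlighted search terms are wrapped in <strong>...</strong>.
    private static final Pattern HIGHLIGHT = Pattern.compile("<strong>(.*?)</strong>");
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    /** Returns the highlighted terms in the order in which they occur. */
    static List<String> highlightedTerms(String excerpt) {
        List<String> terms = new ArrayList<>();
        Matcher m = HIGHLIGHT.matcher(excerpt);
        while (m.find()) {
            terms.add(m.group(1));
        }
        return terms;
    }

    /** Strips all tags and returns the remaining text. */
    static String plainText(String excerpt) {
        return TAG.matcher(excerpt).replaceAll("").trim();
    }

    public static void main(String[] args) {
        // Made-up sample in the shape of an excerpt after JSON decoding.
        String excerpt = "<div><span>The <strong>Sling</strong> Launchpad uses <strong>sling</strong> bundles</span></div>";
        System.out.println(highlightedTerms(excerpt)); // prints [Sling, sling]
        System.out.println(plainText(excerpt));        // prints The Sling Launchpad uses sling bundles
    }
}
```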

Let's have an example:

http://localhost:8080/content.query.json?queryType=xpath&statement=//*[jcr:contains(.,'sling')]/(rep:excerpt(.))

This will return the following JSON that includes

  • the name
  • the path
  • the excerpt
  • the score

for each entry (note it no longer returns the primaryType).

 [
    {
       "name":"",
       "jcr:path":"/",
       "rep:excerpt()":"<div><span><strong>sling<\/strong>:redirect /index.html<\/span><\/div>",
       "jcr:score":"1000"
    },
    {
       "name":"jcr:content",
       "jcr:path":"/index.html/jcr:content",
       "rep:excerpt()":"<div><span> Welcome to the <strong>Sling<\/strong> Launchpad Welcome to the <strong>Sling<\/strong> Launchpad Apache <strong>Sling<\/strong> currently in incubation is a web framework that uses a Java Content<\/span><span>... content bundles can be loaded unloaded and reconfigured at runtime The <strong>Sling<\/strong> Launchpad is a ready to run <strong>Sling<\/strong> configuration providing an embedded JCR content repository and web server<\/span><span>... root URL as the WebDAV server URL Use our mailing lists to contact the <strong>Sling<\/strong> developers team The <strong>Sling<\/strong> OSGi management console is available at system console The Sling client ...<\/span><\/div>",
       "jcr:score":"500"
    },
    {
       "name":"jcr:content",
       "jcr:path":"/sling-test/sling/sling-test.html/jcr:content",
       "rep:excerpt()":"<div><span>... px background color red color white padding em font weight bold Automated <strong>Sling<\/strong> client library tests TODO for now running these tests requires setting<\/span><span>... undefined typeof data dummy function testGetSessionInfo var session <strong>Sling<\/strong> getSessionInfo assertNotNull <strong>Sling<\/strong> getSessionInfo session assertEquals session userID is a string string<\/span><span>... indexOf default function testRemoveContent var deletePath baseTestPath <strong>sling<\/strong> test testhtml nodes delete now var c slingPost deletePath title hello ...<\/span><\/div>",
       "jcr:score":"313"
    }
 ]

 

This is quite unsatisfactory. What we really want is not the jcr:content nodes, right? We want their parent nodes, but still have the excerpt from the content.

So, let's refine in a first step and get the parents as the search result:
http://localhost:8080/content.query.json?queryType=xpath&statement=//*[jcr:contains(jcr:content,'sling')]/(rep:excerpt(.))

This gives us the following. This is OK in terms of nodes returned: it gives us the nodes whose content contains "sling".
But now the excerpts are empty. The excerpt is drawn from the matched node itself, and that node (unlike its jcr:content child) does not contain the word "sling".

 [
    {
       "name":"index.html",
       "jcr:path":"/index.html",
       "rep:excerpt()":"<excerpt><fragment><\/fragment><\/excerpt>",
       "jcr:score":"1000"
    },
    {
       "name":"sling-test.html",
       "jcr:path":"/sling-test/sling/sling-test.html",
       "rep:excerpt()":"<excerpt><fragment><\/fragment><\/excerpt>",
       "jcr:score":"903"
    }
 ]

 

*STOPPER*
After struggling through JPDA sessions for quite a while, I must confess that I am a bit lost about excerptPath.
Looking at lines 155ff, the value written under the key REP_EXCERPT = "rep:excerpt()" is looked up in the row as "rep:excerpt(" + exerptPath + ")". So a non-empty excerpt would need both columns in the same result row, and it would seem that a row cannot contain both at the same time if exerptPath is anything other than "".

But this may be because I am not too well acquainted with the jcr functions of xpath.
If ever I get a handle on this, I will update this post. Or maybe someone can provide a meaningful example?

My idea was to modify the query above as follows, so that the parent node is returned as the result and the excerpt is taken from its jcr:content child:
http://localhost:8080/content.query.json?queryType=xpath&statement=//*[jcr:contains(jcr:content,'sling')]/(rep:excerpt(jcr:content))&excerptPath=jcr:content
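The observation about lines 155ff can be reproduced without a repository. The sketch below replays the column loop of lines 152-162 on plain maps. The class ExcerptColumnSketch, the method render and the row contents are invented for illustration. In particular, which columns Jackrabbit really returns for rep:excerpt(jcr:content) is an assumption here, not something I have verified.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical replay of the column handling in lines 152-162 on plain maps. */
class ExcerptColumnSketch {
    static final String REP_EXCERPT = "rep:excerpt()";

    /** Returns the key/value pairs the servlet would write for one result row. */
    static Map<String, String> render(Map<String, Object> row, String excerptPath) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String colName : row.keySet()) {
            String strValue;
            if (colName.equals(REP_EXCERPT)) {
                // The value is looked up under a different key as soon as excerptPath is not "".
                Object ev = row.get("rep:excerpt(" + excerptPath + ")");
                strValue = (ev == null) ? "" : ev.toString();
            } else {
                Object v = row.get(colName);
                strValue = (v == null) ? "" : v.toString();
            }
            out.put(colName, strValue);
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up row that only carries the "rep:excerpt()" column.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("jcr:path", "/index.html");
        row.put("rep:excerpt()", "<div>x</div>");
        System.out.println(render(row, ""));            // excerpt is passed through
        System.out.println(render(row, "jcr:content")); // excerpt comes out as ""

        // Made-up row that only carries a "rep:excerpt(jcr:content)" column.
        Map<String, Object> other = new LinkedHashMap<>();
        other.put("rep:excerpt(jcr:content)", "<div>y</div>");
        System.out.println(render(other, "jcr:content")); // handled by the else branch
    }
}
```

In the second case the excerpt is lost, and in the third case the column is written under its own name by the else branch, so excerptPath plays no part in either. That matches what I saw in the debugger.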

Source:

  1: /*
  2:  * Licensed to the Apache Software Foundation (ASF) under one
  3:  * or more contributor license agreements.  See the NOTICE file
  4:  * distributed with this work for additional information
  5:  * regarding copyright ownership.  The ASF licenses this file
  6:  * to you under the Apache License, Version 2.0 (the
  7:  * "License"); you may not use this file except in compliance
  8:  * with the License.  You may obtain a copy of the License at
  9:  *
 10:  *   http://www.apache.org/licenses/LICENSE-2.0
 11:  *
 12:  * Unless required by applicable law or agreed to in writing,
 13:  * software distributed under the License is distributed on an
 14:  * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 15:  * KIND, either express or implied.  See the License for the
 16:  * specific language governing permissions and limitations
 17:  * under the License.
 18:  */
 19: package org.apache.sling.servlets.get;
 20: 
 21: import java.io.IOException;
 22: import java.io.InputStream;
 23: import java.util.ArrayList;
 24: import java.util.Iterator;
 25: import java.util.List;
 26: import java.util.Map;
 27: 
 28: import javax.jcr.RepositoryException;
 29: import javax.jcr.Value;
 30: import javax.jcr.query.Query;
 31: 
 32: import org.apache.sling.api.SlingException;
 33: import org.apache.sling.api.SlingHttpServletRequest;
 34: import org.apache.sling.api.SlingHttpServletResponse;
 35: import org.apache.sling.api.resource.Resource;
 36: import org.apache.sling.api.resource.ResourceResolver;
 37: import org.apache.sling.api.resource.ResourceUtil;
 38: import org.apache.sling.api.servlets.SlingSafeMethodsServlet;
 39: import org.apache.sling.commons.json.JSONException;
 40: import org.apache.sling.commons.json.io.JSONWriter;
 41: import org.apache.sling.jcr.resource.JcrResourceUtil;
 42: import org.apache.sling.servlets.get.helpers.JsonRendererServlet;
 43: import org.slf4j.Logger;
 44: import org.slf4j.LoggerFactory;
 45: 
 46: /**
 47:  * A SlingSafeMethodsServlet that renders the search results as JSON data
 48:  *
 49:  * @scr.component immediate="true" metatype="no"
 50:  * @scr.service interface="javax.servlet.Servlet"
 51:  * 
 52:  * @scr.property name="service.description" value="Default Query Servlet"
 53:  * @scr.property name="service.vendor" value="The Apache Software Foundation"
 54:  * 
 55:  * Use this as the default query servlet for json get requests for Sling
 56:  * @scr.property name="sling.servlet.resourceTypes"
 57:  *               value="sling/servlet/default"
 58:  * @scr.property name="sling.servlet.extensions" value="json"
 59:  * @scr.property name="sling.servlet.selectors" value="query"
 60:  */
 61: public class JsonQueryServlet extends SlingSafeMethodsServlet {
 62:     private final Logger log = LoggerFactory.getLogger(JsonQueryServlet.class);
 63: 
 64:     /** Search clause */
 65:     public static final String STATEMENT = "statement";
 66: 
 67:     /** Query type */
 68:     public static final String QUERY_TYPE = "queryType";
 69: 
 70:     /** Result set offset */
 71:     public static final String OFFSET = "offset";
 72: 
 73:     /** Number of rows requested */
 74:     public static final String ROWS = "rows";
 75: 
 76:     /** property to append to the result */
 77:     public static final String PROPERTY = "property";
 78: 
 79:     /** exerpt lookup path */
 80:     public static final String EXCERPT_PATH = "excerptPath";
 81: 
 82:     /** rep:exerpt */
 83:     private static final String REP_EXCERPT = "rep:excerpt()";
 84: 
 85:     @Override
 86:     protected void doGet(SlingHttpServletRequest req,
 87:             SlingHttpServletResponse resp) throws IOException {
 88:         dumpResult(req, resp);
 89:     }
 90: 
 91:     /**
 92:      * Dumps the result as JSON object.
 93:      *
 94:      * @param req request
 95:      * @param resp response
 96:      * @throws IOException in case the search will unexpectedly fail
 97:      */
 98:     private void dumpResult(SlingHttpServletRequest req,
 99:             SlingHttpServletResponse resp) throws IOException {
100:         try {
101:             ResourceResolver resolver = req.getResourceResolver();
102: 
103:             String statement = req.getParameter(STATEMENT);
104:             String queryType = (req.getParameter(QUERY_TYPE) != null && req.getParameter(
105:                 QUERY_TYPE).equals(Query.SQL)) ? Query.SQL : Query.XPATH;
106: 
107:             Iterator<Map<String, Object>> result = resolver.queryResources(
108:                 statement, queryType);
109: 
110:             if (req.getParameter(OFFSET) != null) {
111:                 long skip = Long.parseLong(req.getParameter(OFFSET));
112:                 while (skip > 0 && result.hasNext()) {
113:                     result.next();
114:                     skip--;
115:                 }
116:             }
117: 
118:             resp.setContentType(JsonRendererServlet.responseContentType);
119:             resp.setCharacterEncoding("UTF-8");
120: 
121:             final JSONWriter w = new JSONWriter(resp.getWriter());
122:             w.array();
123: 
124:             long count = -1;
125:             if (req.getParameter(ROWS) != null) {
126:                 count = Long.parseLong(req.getParameter(ROWS));
127:             }
128: 
129:             List<String> properties = new ArrayList<String>();
130:             if (req.getParameterValues(PROPERTY) != null) {
131:                 for (String property : req.getParameterValues(PROPERTY)) {
132:                     properties.add(property);
133:                 }
134:             }
135: 
136:             String exerptPath = "";
137:             if (req.getParameter(EXCERPT_PATH) != null) {
138:                 exerptPath = req.getParameter(EXCERPT_PATH);
139:             }
140: 
141:             // iterate through the result set and build the "json result"
142:             while (result.hasNext() && count != 0) {
143:                 Map<String, Object> row = result.next();
144: 
145:                 w.object();
146:                 String path = row.get("jcr:path").toString();
147: 
148:                 w.key("name");
149:                 w.value(ResourceUtil.getName(path));
150: 
151:                 // dump columns
152:                 for (String colName : row.keySet()) {
153:                     w.key(colName);
154:                     String strValue = "";
155:                     if (colName.equals(REP_EXCERPT)) {
156:                         Object ev = row.get("rep:excerpt(" + exerptPath + ")");
157:                         strValue = (ev == null) ? "" : ev.toString();
158:                     } else {
159:                         strValue = formatValue(row.get(colName));
160:                     }
161:                     w.value(strValue);
162:                 }
163: 
164:                 // load properties and add it to the result set
165:                 if (!properties.isEmpty()) {
166:                     Resource nodeRes = resolver.getResource(path);
167:                     dumpProperties(w, nodeRes, properties);
168:                 }
169: 
170:                 w.endObject();
171:                 count--;
172:             }
173:             w.endArray();
174:         } catch (JSONException je) {
175:             throw wrapException(je);
176:         }
177:     }
178: 
179:     private void dumpProperties(JSONWriter w, Resource nodeRes,
180:             List<String> properties) throws JSONException {
181: 
182:         // nothing to do if there is no resource
183:         if (nodeRes == null) {
184:             return;
185:         }
186: 
187: 
188:         ResourceResolver resolver = nodeRes.getResourceResolver();
189:         for (String property : properties) {
190:             Resource prop = resolver.getResource(nodeRes, property);
191:             if (prop != null) {
192:                 String strValue;
193:                 Value value = prop.adaptTo(Value.class);
194:                 if (value != null) {
195:                     strValue = formatValue(value);
196:                 } else {
197:                     strValue = prop.adaptTo(String.class);
198:                     if (strValue == null) {
199:                         strValue = "";
200:                     }
201:                 }
202:                 w.key(property);
203:                 w.value(strValue);
204:             }
205:         }
206: 
207:     }
208: 
209:     private String formatValue(Value value) {
210:         try {
211:             return formatValue(JcrResourceUtil.toJavaObject(value));
212:         } catch (RepositoryException re) {
213:             // might log
214:         }
215:         return "";
216:     }
217: 
218:     private String formatValue(Object value) {
219:         String strValue;
220:         if (value instanceof InputStream) {
221:             // binary value comes as a LazyInputStream
222:             strValue = "[binary]";
223: 
224:             // just to be clean, close the stream
225:             try {
226:                 ((InputStream) value).close();
227:             } catch (IOException ignore) {
228:             }
229:         } else if (value != null) {
230:             strValue = value.toString();
231:         } else {
232:             strValue = "";
233:         }
234: 
235:         return strValue;
236:     }
237: 
238:     /**
239:      * @param e
240:      * @throws org.apache.sling.api.SlingException wrapping the given exception
241:      */
242:     private SlingException wrapException(Exception e) {
243:         log.warn("Error in QueryServlet: " + e.toString(), e);
244:         return new SlingException(e.toString(), e);
245:     }
246: }
247: 

1 comment:

Unknown said...

Hi Moritz, I would like to say thank you for writing this blog and contributing excellent documentation to the Sling/CRX Quickstart community. It cannot be stressed enough how much I appreciate your efforts.

Lars