Dapper XML Format

The core Dapper output format was developed by Dapper and captures the content from a website in a nested hierarchy.  We'll use the following example to explain the XML:

./ishot-1.jpg

The root node of every Dapp XML is named elements.  Every Dapp XML contains at least one child of the elements node named dapper, which contains metadata relevant to the Dapp and the specific execution.  Here are some of the nodes the dapper node may contain:

  • dappTitle: the title of the Dapp
  • dappName: the identifier of the Dapp (same as dappTitle, but with only letters and numbers)
  • url: a URL that was in the original sample set when the Dapp was created
  • applyToUrl: the URL on which the Dapp was just run
  • executionTime: the amount of time Dapper spent within the algorithms extracting content
  • ranAt: the time at which the Dapp was run (useful in debugging caching)
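
As a rough illustration, the metadata described above could be read with Python's standard xml.etree.ElementTree module. This is only a sketch: the sample document, all of its values, the formats of executionTime and ranAt, and the helper name read_dapper_metadata are assumptions made up for this example, not output from a real Dapp.

```python
import xml.etree.ElementTree as ET

# Hypothetical Dapp XML: every value below is invented for illustration,
# including the formats used for executionTime and ranAt.
SAMPLE_XML = """\
<elements>
  <dapper>
    <dappTitle>My News Dapp</dappTitle>
    <dappName>MyNewsDapp</dappName>
    <url>http://example.com/news</url>
    <applyToUrl>http://example.com/news?page=2</applyToUrl>
    <executionTime>0.42</executionTime>
    <ranAt>2009-11-27 11:04:00</ranAt>
  </dapper>
</elements>
"""


def read_dapper_metadata(xml_text):
    """Return the children of the dapper node as a {name: text} dict."""
    root = ET.fromstring(xml_text)
    if root.tag != "elements":
        raise ValueError("expected root node 'elements', got %r" % root.tag)
    dapper = root.find("dapper")
    if dapper is None:
        raise ValueError("no 'dapper' node found under 'elements'")
    # Any of the metadata nodes may be absent, so collect whatever is present.
    return {child.tag: (child.text or "").strip() for child in dapper}


metadata = read_dapper_metadata(SAMPLE_XML)
for name, value in metadata.items():
    print(name, "=", value)
```

Because the list above only names some of the nodes the dapper node may contain, the sketch collects every child it finds instead of looking for a fixed set.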

After the dapper node, there are two types of nodes you can encounter: "group" nodes and "field" nodes. 

Field nodes represent a field in the Dapp (and are named by whatever the creator of the Dapp called them in the Dapp Factory, modified so they are safe for XML).  They contain content from the original website.  Each field node has the following attributes:

  • fieldName: the original name of the field, before it was made safe for XML
  • originalElement: the corresponding element in the HTML document
  • type: always set to "field" - useful for XQuery and manipulating the DOM
  • href: the href of the element or any ancestor element in the HTML - if it is a link in the HTML, the destination of the link will be in this attribute (optional)
  • src: the src of the element or any ancestor element in the HTML - if it is an image in the HTML, the URL of the image will be in this attribute (optional)
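
The type attribute makes field nodes easy to select regardless of what they were named. A minimal sketch in Python follows, assuming a made-up document: the field names Headline and Thumbnail, the attribute values (including the guess that originalElement holds an HTML tag name), and the helper read_fields are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical Dapp XML: field names and attribute values are invented.
SAMPLE_XML = """\
<elements>
  <dapper>
    <dappName>MyNewsDapp</dappName>
  </dapper>
  <Headline fieldName="Headline" originalElement="a" type="field"
            href="http://example.com/story/1">First story</Headline>
  <Thumbnail fieldName="Thumbnail" originalElement="img" type="field"
             src="http://example.com/img/1.jpg"/>
</elements>
"""


def read_fields(xml_text):
    """Collect every node whose type attribute is 'field'."""
    root = ET.fromstring(xml_text)
    fields = []
    # Select by attribute, since the node names vary from Dapp to Dapp.
    for node in root.findall(".//*[@type='field']"):
        fields.append({
            "name": node.get("fieldName", node.tag),
            "text": (node.text or "").strip(),
            "href": node.get("href"),  # optional: None when absent
            "src": node.get("src"),    # optional: None when absent
        })
    return fields


fields = read_fields(SAMPLE_XML)
for field in fields:
    print(field["name"], field["text"], field["href"], field["src"])
```

Since href and src are optional, the sketch reads them with get, which yields None when the attribute is missing.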

Group nodes group together one or more field nodes.  Grouping is determined manually by the creator of the Dapp in the Dapp Factory.  Like field nodes, group nodes are renamed to be safe for XML.  Group nodes have the following attributes:

  • groupName: the original name of the group, before it was made safe for XML
  • type: always set to "group" - useful for XQuery and manipulating the DOM
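
To show how groups and fields fit together, here is a small Python sketch that turns each group node into a record of its fields. The sample document, the names Story, Headline and Summary, and the helper read_groups are hypothetical, and the sketch assumes that field nodes sit directly inside their group node and that group nodes are direct children of elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical Dapp XML: group and field names are invented for illustration.
SAMPLE_XML = """\
<elements>
  <dapper>
    <dappName>MyNewsDapp</dappName>
  </dapper>
  <Story groupName="Story" type="group">
    <Headline fieldName="Headline" originalElement="a" type="field">First story</Headline>
    <Summary fieldName="Summary" originalElement="p" type="field">Summary one</Summary>
  </Story>
  <Story groupName="Story" type="group">
    <Headline fieldName="Headline" originalElement="a" type="field">Second story</Headline>
    <Summary fieldName="Summary" originalElement="p" type="field">Summary two</Summary>
  </Story>
</elements>
"""


def read_groups(xml_text):
    """Return one record per group node, mapping field names to their text."""
    root = ET.fromstring(xml_text)
    groups = []
    # Assumption: group nodes are direct children of the root node.
    for group in root.findall("*[@type='group']"):
        record = {}
        # Assumption: field nodes are direct children of their group node.
        for field in group.findall("*[@type='field']"):
            record[field.get("fieldName", field.tag)] = (field.text or "").strip()
        groups.append({
            "groupName": group.get("groupName", group.tag),
            "fields": record,
        })
    return groups


groups = read_groups(SAMPLE_XML)
for group in groups:
    print(group["groupName"], group["fields"])
```

Selecting on the type attribute rather than on node names keeps the sketch independent of whatever the Dapp creator called the groups and fields.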

