java - Disable XML Entity resolving in JDOM / DOM


I am writing a Java application for postprocessing XML files. These XML files come from the RDF export of a Semantic MediaWiki, so they have RDF/XML syntax.

My problem is the following: when I read the XML file, all the entities in the file are resolved to the value specified in the DOCTYPE. For example, in the DOCTYPE I have

<!DOCTYPE rdf:RDF[ <!ENTITY wiki 'http://example.org/smartgrid/index.php/special:uriresolver/'> .. ]>

and in the root element

<rdf:RDF xmlns:wiki="&wiki;" .. >

This means that

<swivt:subject rdf:about="&wiki;main_page"> 

becomes

<swivt:subject rdf:about="http://example.org/smartgrid/index.php/special:uriresolver/main_page"> 

I have tried using both JDOM and the standard Java DOM. The code I think is relevant here is, for the standard DOM:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setExpandEntityReferences(false);
factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);

and for JDOM:

SAXBuilder builder = new SAXBuilder();
builder.setExpandEntities(false); // retain entities
builder.setValidation(false);
builder.setFeature("http://xml.org/sax/features/resolve-dtd-uris", false);

But the entities are resolved throughout the whole XML document nonetheless. Am I missing something? Hours of searching have only led me to the 'expandEntities' settings, and they don't seem to work.
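The behaviour described above can be reproduced with the JDK's bundled parser alone. The following is a minimal sketch, not the asker's actual code: the class name, the sample document, and the `about` attribute are made up for illustration. It parses the same document with setExpandEntityReferences set to true and to false and reads back the attribute value. One likely explanation for the result is that the XML specification's attribute-value normalization replaces entity references inside attribute values before the DOM is built, so the flag would only affect references in element content.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class EntityInAttributeDemo {
    // Hypothetical sample document modeled on the DOCTYPE in the question.
    static final String XML =
        "<!DOCTYPE root [ <!ENTITY wiki 'http://example.org/wiki/'> ]>"
        + "<root about=\"&wiki;main_page\"/>";

    // Parses the sample and returns the value of the "about" attribute.
    static String parseAbout(boolean expandEntityReferences) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setExpandEntityReferences(expandEntityReferences);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(XML)));
        return doc.getDocumentElement().getAttribute("about");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseAbout(true));
        System.out.println(parseAbout(false));
    }
}
```

On the JDK's default parser both calls should return the expanded URI, which matches what the question reports.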

Any hint is highly appreciated :)

I recommend the JDOM FAQ:

[http://www.jdom.org/docs/faq.html#a0350]

How do I keep the DTD from loading? Even when I turn off validation the parser tries to load the DTD file.

Even when validation is turned off, an XML parser will by default load the external DTD file in order to parse the DTD for external entity declarations. Xerces has a feature to turn off this behavior named "http://apache.org/xml/features/nonvalidating/load-external-dtd", and if you know you're using Xerces you can set this feature on the builder.

builder.setFeature(
  "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
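For readers using the standard DOM API rather than JDOM, the same feature can be set on a DocumentBuilderFactory. This is a minimal sketch that assumes the JDK's bundled Xerces-derived parser, which recognizes this feature name; another parser may throw a ParserConfigurationException from setFeature. The class name and the unreachable DTD URL are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class SkipExternalDtdDemo {
    static final String LOAD_EXTERNAL_DTD =
        "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Hypothetical document whose DTD lives at an unreachable address.
    static final String XML =
        "<!DOCTYPE root SYSTEM \"http://example.invalid/missing.dtd\">"
        + "<root>text</root>";

    // Parses the sample without fetching the DTD and returns the root name.
    static String parseRootName() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);
        // Ask the parser not to fetch the external DTD at all.
        factory.setFeature(LOAD_EXTERNAL_DTD, false);
        Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
        return doc.getDocumentElement().getNodeName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseRootName());
    }
}
```

Note that this only stops the external DTD from being loaded. Entities declared in the internal subset, like the one in the question, are still declared and still expanded.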

If you're using another parser like Crimson, your best bet is to set up an EntityResolver that resolves the DTD without actually reading the separate file.

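import java.io.StringReader;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

public class NoOpEntityResolver implements EntityResolver {
  // Answer every lookup with an empty source so no external file is read.
  public InputSource resolveEntity(String publicId, String systemId) {
    // StringReader stands in for the deprecated StringBufferInputStream.
    return new InputSource(new StringReader(""));
  }
}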

Then in the builder...

builder.setEntityResolver(new NoOpEntityResolver());

There is a downside to this approach. Any entities in the document will be resolved to the empty string, and will effectively disappear. If your document has entities, you need to call setExpandEntities(false) and ensure the EntityResolver only suppresses the DocType.
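One way to narrow the resolver so that it only suppresses the DTD might look like the sketch below. It uses the JDK's DocumentBuilder instead of JDOM's SAXBuilder, and the `.dtd` suffix check is a heuristic chosen for illustration, not something the FAQ prescribes. Returning null from resolveEntity tells the parser to fall back to its default resolution for everything else. The class name and sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

final class DtdOnlyResolverDemo {
    // Hypothetical document whose DTD lives at an unreachable address.
    static final String XML =
        "<!DOCTYPE root SYSTEM \"http://example.invalid/missing.dtd\">"
        + "<root>text</root>";

    // Assumption: anything whose system ID ends in ".dtd" is the DTD.
    static final EntityResolver DTD_ONLY = (publicId, systemId) ->
        (systemId != null && systemId.endsWith(".dtd"))
            ? new InputSource(new StringReader("")) // empty stand-in for the DTD
            : null;                                 // default handling otherwise

    // Parses the sample with the resolver installed and returns the root name.
    static String parseRootName() throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver(DTD_ONLY);
        Document doc = builder.parse(new InputSource(new StringReader(XML)));
        return doc.getDocumentElement().getNodeName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseRootName());
    }
}
```

With JDOM the same resolver object could be passed to builder.setEntityResolver, as in the FAQ snippet.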

