Closed
Labels: bug, fixed
Description
Hello,
In jsoup 1.9.2, processing instructions are no longer parsed correctly.
Here is sample code that reproduces the issue:
package jsoupbug;

import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Node;
import org.jsoup.parser.Parser;

public class JsoupBug {
    private static final String XML = "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<?myProcessingInstruction My Processing instruction.?>";

    public static void main(String[] args) {
        // Parse with the XML parser and disable pretty-printing so the output is unmodified.
        Document document = Jsoup.parse(XML, "", Parser.xmlParser());
        document.outputSettings().prettyPrint(false);

        // Index 2 is the custom processing instruction, after the XML declaration
        // and the newline that separates the two.
        List<Node> nodes = document.childNodes();
        Node node = nodes.get(2);
        String outerHtml = node.outerHtml();
        System.out.println(outerHtml);
    }
}
If I understand the spec correctly (https://www.w3.org/TR/REC-xml/#sec-pi), spaces are valid characters in the content of a processing instruction, but jsoup 1.9.2 turns each word of the content into an empty-valued attribute.
With version 1.9.2 this prints:
<?myprocessingInstruction my="" processing="" instruction.=""?>
However, in 1.9.1 the behavior is what I would expect:
<?myProcessingInstruction My Processing instruction.?>
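For comparison, the reading of the spec above can be cross-checked with the JDK's built-in DOM parser instead of jsoup. This is only a minimal sketch: the class name PiSpecCheck and the helper firstPi are made up for illustration, and a <root/> element is appended to the sample because the JDK parser, unlike jsoup's XML parser, rejects a document without a root element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of jsoup or of the original report.
class PiSpecCheck {
    // Same sample as in the report, plus an assumed <root/> element so that
    // the strict JDK parser accepts the document.
    static final String XML = "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n"
            + "<?myProcessingInstruction My Processing instruction.?>"
            + "<root/>";

    // Returns {target, data} of the first processing instruction at document
    // level, or null if there is none.
    static String[] firstPi(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            for (Node n = doc.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.PROCESSING_INSTRUCTION_NODE) {
                    ProcessingInstruction pi = (ProcessingInstruction) n;
                    return new String[] {pi.getTarget(), pi.getData()};
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        String[] pi = firstPi(XML);
        System.out.println("target: " + pi[0]);
        System.out.println("data:   " + pi[1]);
    }
}
```

The DOM API exposes the content of a processing instruction as a single data string rather than as attributes, which matches the 1.9.1 output shown above.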