Burp Suite User Forum


SAXParser Dependency Dilemma

A | Last updated: May 19, 2017 08:02AM UTC

Hi guys,

I'm in the process of writing a Burp extension in Python, and one of the dependency libraries makes use of the "xml.etree.cElementTree" module to parse XML markup. The problem is that any call to the "xml.etree.cElementTree.parse" function causes Jython to raise the following exception: "java.lang.ClassNotFoundException: org.apache.xerces.parsers.SAXParser".

At first glance, the problem appears to be that the "org.apache.xerces.parsers.SAXParser" class is missing. However, I can import it directly within my extension using `import org.apache.xerces.parsers.SAXParser`, which doesn't raise any "ImportError".

Tracking down the root cause, it seems that the Jython internal script "/Lib/xml/parsers/expat.py" is responsible for loading the SAXParser class. It attempts to do so by executing the following try-except clause (at line 88):

```
try:
    self._reader = XMLReaderFactory.createXMLReader(_mangled_xerces_parser_name)
except:
    self._reader = XMLReaderFactory.createXMLReader(_xerces_parser_name)
```

Where the variables "_mangled_xerces_parser_name" and "_xerces_parser_name" are set to "org.python.apache.xerces.parsers.SAXParser" and "org.apache.xerces.parsers.SAXParser" respectively. Inspecting the Jython standalone jar reveals it does indeed contain a "SAXParser.class" at "org/python/apache/xerces/parsers/SAXParser.class". So I can't figure out why the class couldn't be loaded successfully.

I tried unzipping the "xercesImpl.jar" file, which contains the required class, into the same directory as the extension, but that doesn't help at all. Any suggestions on how to solve this gracefully?

Thanks, AE

Adam, PortSwigger Agent | Last updated: May 22, 2017 09:57AM UTC

Hi AE, Is this problem still ongoing following the release of 1.7.23? More specifically, if you add the jar to the directory specified in the "Java Environment", does this alleviate the problem? Thanks, Adam

PortSwigger Agent | Last updated: May 31, 2017 06:09AM UTC

Hi AE, Thanks for getting back to us. We have been able to reproduce your issue with the latest version of Burp. We will investigate further and get back to you.

Burp User | Last updated: Jun 01, 2017 07:29PM UTC

Hi Adam,

Yes, this is still a problem with the latest Burp Suite release (1.7.23). I did try to add the jar file into the "Java Environment" directory, but unfortunately that didn't help. You should be able to reproduce this problem by adding these two lines of code to any Burp extension written in Python (or just add them to "test.py" and import the file):

```
import xml.etree.ElementTree as ET
ET.fromstring('<test></test>')
```

Thanks, AE

PortSwigger Agent | Last updated: Jun 02, 2017 09:50AM UTC

On further investigation, this is related to a long-standing bug in Jython: - http://bugs.jython.org/issue1127 Unfortunately, we can't easily work around this in Burp Extender.

Burp User | Last updated: Jun 02, 2017 11:04AM UTC

In the meantime, you can use Java's XML parsing, with a basic (tested, working) demo available at https://gist.github.com/ahri/f89db790fdac65a524e7133941ed2110 - hope that helps!

Burp User | Last updated: Jun 07, 2017 03:02AM UTC

Thanks for taking the time to investigate this. The only workaround I could find so far was to load Burp with the JAR file "xercesImpl.jar" on the classpath, using the command `java -classpath xercesImpl.jar;burpsuite_pro.jar burp.StartBurp`. But that means the extension won't work when Burp is started normally from the EXE launcher, so this is by no means an ideal workaround. And I cannot really use Java's XML parsing, since the problem is actually with a third-party library that comes pre-installed. I'm not sure if it would be technically possible to monkey-patch the entire thing within my extension's code by overriding `ET.fromstring` and similar functions at runtime with equivalent ones that make use of Java's XML parsing instead. Looking forward to a better solution for this issue.

PortSwigger Agent | Last updated: Jun 07, 2017 07:19AM UTC

What's the third-party library that's causing this issue? Monkey patching ET.fromstring sounds like a pretty smart workaround, I'd be interested to hear how you get on. You can also hack the class path within Jython using these instructions: - http://www.jython.org/jythonbook/en/1.0/appendixB.html#working-with-classpath I think adding jython-standalone.jar will do the trick; xercesImpl.jar shouldn't be required.

Burp User | Last updated: Nov 29, 2017 01:10AM UTC

Hi, Have you been able to find a better solution for this problem?

PortSwigger Agent | Last updated: Nov 29, 2017 09:08AM UTC

Hi Sam, Unfortunately, we haven't identified a better solution. Until Jython addresses this issue, there's not much we can do: - http://bugs.jython.org/issue1127

Burp User | Last updated: May 30, 2018 01:04PM UTC

There seems to be a more detailed discussion of the issue here: http://bugs.jython.org/issue1537

PortSwigger Agent | Last updated: May 30, 2018 02:53PM UTC

Hi, Thanks for doing a detailed investigation and suggesting a workaround. That's right, we load Jython using a custom class loader. Jython should use that throughout, because of class loader inheritance, but it uses a different class loader, resulting in this bug. The issue I linked describes that. Setting the context class loader for all Burp's threads seems very intrusive. Given that this is a corner case and there are workarounds available, it's unlikely we'd take that approach. You can get the path to the Jython jar by looking in sys.path.
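As a rough illustration of the `sys.path` suggestion, the sketch below pulls jar paths out of `sys.path`-style entries. It assumes the standalone jar shows up as an entry of the form `/some/dir/jython-standalone-x.y.z.jar/Lib`, which should be checked against the actual `sys.path` inside Burp. The function name `find_jar_paths` and the `name_hint` parameter are made up for this example.

```python
def find_jar_paths(entries, name_hint="jython"):
    """Return the unique *.jar paths embedded in sys.path-style entries.

    Only jars whose file name contains `name_hint` are returned.
    """
    found = []
    for entry in entries:
        end = entry.lower().find(".jar")
        if end == -1:
            continue  # not a jar-based entry (e.g. "__classpath__")
        jar_path = entry[:end + len(".jar")]
        # Take the file name, accepting both "/" and "\\" as separators.
        file_name = jar_path.replace("\\", "/").rsplit("/", 1)[-1]
        if name_hint in file_name.lower() and jar_path not in found:
            found.append(jar_path)
    return found
```

Inside an extension this would be called as `find_jar_paths(sys.path)` after `import sys`. An empty result means the assumption about the entry format does not hold in that environment.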

Burp User | Last updated: Jun 19, 2018 03:14PM UTC

For anyone actually trying to use a jython classpath hack - it won't work starting from java 9.

Burp User | Last updated: Jun 19, 2018 05:16PM UTC

Paul, could you please elaborate on how http://bugs.jython.org/issue1127 is responsible for the bug here? The jython.jar file contains the org.python.apache.xerces.parsers.SAXParser class (and it can be imported from Python code using an import statement). The problem here is that you dynamically load the Jython jar using (most probably) a custom URLClassLoader which is separate from the system classloader. This is why, when Python code ultimately calls XMLReaderFactory.createXMLReader('org.python.apache.xerces.parsers.SAXParser'), the default classloader is used, which has no idea about any classes in Jython's jar. I don't personally think that this is a Jython bug. This might be fixed in Burp: when you create a ClassLoader for the Jython jar, you should set it as the current classloader for all Burp's threads via setContextClassLoader. See an example in the following gist, with a minimal version of the code that reproduces the java.lang.ClassNotFoundException: https://gist.github.com/ngo/2e694fe096273cf928424fc6f19938ff

Burp User | Last updated: Jun 19, 2018 05:49PM UTC

And the workaround from inside a Python extension would be as follows (assuming `JavaThread` is java.lang.Thread imported under that alias):

import os
from java.lang import Thread as JavaThread
from java.net import URL, URLClassLoader

classloader = URLClassLoader([URL("file://" + os.getcwd() + "/xercesImpl-2.11.0.jar")], JavaThread.currentThread().getContextClassLoader())
JavaThread.currentThread().setContextClassLoader(classloader)

Unfortunately, it requires bundling the Xerces jar with the extension. A more elegant way would be to obtain the path to the Jython jar from Burp and use that, but unfortunately there is no API for an extension to get that information.
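A slightly more defensive variant of the workaround above would restore the previous context class loader once parsing is done, so that other code on the same thread is not affected. The sketch below is only one way to do that. The helper name `context_class_loader` is hypothetical, and it relies solely on the `getContextClassLoader` and `setContextClassLoader` methods of the thread object passed in.

```python
from contextlib import contextmanager


@contextmanager
def context_class_loader(thread, loader):
    """Temporarily install `loader` as `thread`'s context class loader."""
    previous = thread.getContextClassLoader()
    thread.setContextClassLoader(loader)
    try:
        yield loader
    finally:
        # Always put the original loader back, even if parsing raised.
        thread.setContextClassLoader(previous)


# Assumed usage inside a Jython extension (not runnable outside Jython):
#   from java.lang import Thread as JavaThread
#   with context_class_loader(JavaThread.currentThread(), classloader):
#       ET.fromstring('<test></test>')
```

The loader only needs to be in place while the XML reader is being created, which is why wrapping the individual parse call is enough in this sketch.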

Burp User | Last updated: Jun 20, 2018 10:39PM UTC

Paul, thank you for your answer. I missed the classloader inheritance part. After testing this scenario with a sample pure-Java jar that was able to successfully use the inherited classloader when dynamically loaded from a main program, I am now confident that this is a Jython bug. I am still not convinced, though, that issue 1127 describes the exact problem that we are seeing, because it is related to the way the Jython startup script was setting the jython.jar classpath using a bootclasspath (which has since been fixed, by the way). I will have to dig into the Jython sources a bit...

Burp User | Last updated: Jun 20, 2018 11:47PM UTC

After digging a little bit further, here's what I found. Jython actually uses the parent classloader, as can be demonstrated using the following snippet:

URL[] classUrls = {new URL("file:///path/to/jython-standalone-2.7.0.jar")};
URLClassLoader urlcl = new URLClassLoader(classUrls, oldcl);
Class interpreter = Class.forName("org.python.util.PythonInterpreter", true, urlcl);
Method exec = interpreter.getDeclaredMethod("exec", String.class);
Object instance = interpreter.newInstance();
exec.invoke(instance, "from java.lang import Class");
exec.invoke(instance, "print Class.forName('org.python.apache.xerces.parsers.SAXParser')");

This does not produce an error. But XMLReaderFactory.createXMLReader does not actually use Class.forName. Here's its implementation:

public static XMLReader createXMLReader (String className) throws SAXException {
    return loadClass (SecuritySupport.getClassLoader(), className);
}

SecuritySupport.getClassLoader has the following implementation:

public static ClassLoader getContextClassLoader() {
    return AccessController.doPrivileged((PrivilegedAction<ClassLoader>) () -> {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null)
            cl = ClassLoader.getSystemClassLoader();
        return cl;
    });
}

As you can see, SAX for some reason explicitly uses the thread's contextClassLoader or the system classloader instead of the parent classloader. So, the behavior we're seeing is actually a quirk of SAX, not Jython.
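The lookup behavior described in this post can be modelled with a small toy in plain Python. This is not Java's actual class-loading code. `ToyClassLoader` and `create_xml_reader` are invented for illustration, and the model assumes the custom loader for the Jython jar has the system loader as its parent.

```python
class ToyClassLoader(object):
    """Toy stand-in for a Java ClassLoader with parent-first delegation."""

    def __init__(self, name, classes, parent=None):
        self.name = name
        self.classes = set(classes)
        self.parent = parent

    def load_class(self, class_name):
        # Parent-first: ask the parent before looking at our own classes.
        if self.parent is not None:
            try:
                return self.parent.load_class(class_name)
            except LookupError:
                pass
        if class_name in self.classes:
            return "%s (via %s)" % (class_name, self.name)
        raise LookupError("ClassNotFoundException: " + class_name)


SAX_PARSER = "org.python.apache.xerces.parsers.SAXParser"

# The system loader knows Burp's own classes; the child loader adds the Jython jar.
system_loader = ToyClassLoader("system", ["burp.StartBurp"])
jython_loader = ToyClassLoader("jython-jar", [SAX_PARSER], parent=system_loader)


def create_xml_reader(class_name, context_loader):
    """Mimics the SAX lookup: only the thread's context loader is consulted."""
    return context_loader.load_class(class_name)
```

In this model `jython_loader.load_class(SAX_PARSER)` succeeds, which corresponds to the `Class.forName` call working from inside Jython. `create_xml_reader(SAX_PARSER, system_loader)` raises, which corresponds to the ClassNotFoundException seen when the context class loader is still the system one. Passing `jython_loader` as the context loader makes the lookup succeed.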

Burp User | Last updated: Jun 20, 2018 11:58PM UTC

I was able to confirm my previous statements in an isolated example that does not include jython (see [1]). In that example, there is a jar that calls XMLReaderFactory.createXMLReader("org.apache.xerces.parsers.SAXParser"). This jar also embeds the whole xercesImpl.jar. It is then loaded dynamically by a main program and the createXMLReader call fails. [1] https://gist.github.com/ngo/9c832119868ff8009b45e783babd2616

PortSwigger Agent | Last updated: Jun 21, 2018 09:00AM UTC

That's really interesting! So we had blamed Jython in error. I don't think there's anything we can reasonably include in Burp to work around this. While there are some potential hacky fixes, there's too much risk these will break other things. If you raise an issue with Xerces, I'd be interested to hear how they respond. In the meantime, including a hacky fix within your extension is the only way.

Burp User | Last updated: Jun 21, 2018 04:24PM UTC

Considering the API that jython is using is marked as deprecated, I doubt Xerces devs would fix that. I've raised an issue in jython (http://bugs.jython.org/issue2693), proposing a switch to SAXParserFactory.newInstance, which supports passing your own classloader as a parameter. I think that might be the best solution.
