Channel: Active questions tagged feed - Stack Overflow

lxml.etree.XMLSyntaxError for Korean Characters

I am trying to parse https://api.lever.co/v0/postings/matchgroup?mode=xml, but I am getting the error lxml.etree.XMLSyntaxError: CData section not finished. The issue seems to be caused by the data containing Korean characters.

    import lxml.etree
    import io
    import requests

    url = "https://api.lever.co/v0/postings/matchgroup?mode=xml"
    r = requests.get(url)
    f = io.BytesIO(r.content)
    parser = lxml.etree.XMLParser(recover=False)
    tree = lxml.etree.parse(f, parser)  # Raises lxml.etree.XMLSyntaxError

I can change recover to True, but then some of the entries are missing from the parsed tree.
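
One way to check whether the Korean text is really the cause is to scan the raw response for characters that XML 1.0 forbids everywhere, including inside CDATA sections (for example stray control characters such as a vertical tab). Korean characters are legal XML characters in a correctly encoded document, so the sketch below assumes, without confirmation from the feed itself, that a stray control character may be the culprit. The helper names `find_illegal_xml_chars` and `strip_illegal_xml_chars` and the sample payload are hypothetical, and only the standard library is used.

```python
import re

# Characters that XML 1.0 does not allow anywhere in a document:
# C0 control characters other than tab, LF and CR, plus U+FFFE and U+FFFF.
# (Lone surrogates are also illegal, but cannot appear after decoding UTF-8.)
ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def find_illegal_xml_chars(data, encoding="utf-8", context=20):
    """Return (offset, code point, surrounding text) for each illegal character."""
    text = data.decode(encoding, errors="replace")
    hits = []
    for match in ILLEGAL_XML_CHARS.finditer(text):
        start = max(0, match.start() - context)
        snippet = text[start:match.end() + context]
        hits.append((match.start(), "U+%04X" % ord(match.group()), snippet))
    return hits


def strip_illegal_xml_chars(data, encoding="utf-8"):
    """Return the payload re-encoded with the illegal characters removed."""
    text = data.decode(encoding, errors="replace")
    return ILLEGAL_XML_CHARS.sub("", text).encode(encoding)


# Hypothetical payload: Korean text with a vertical tab (U+000B) inside CDATA.
sample = "<jobs><job><![CDATA[한국어 설명\x0b계속]]></job></jobs>".encode("utf-8")

for offset, codepoint, snippet in find_illegal_xml_chars(sample):
    print(offset, codepoint, repr(snippet))

cleaned = strip_illegal_xml_chars(sample)
```

If the scan reports hits for `r.content`, the cleaned bytes could be wrapped in `io.BytesIO` and passed to `lxml.etree.parse` with `recover=False` as in the snippet above. Whether this resolves the error for this particular feed is untested.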

