- 02 Nov 2021
Techniques and Workarounds
- Updated on 02 Nov 2021
Extracting Data from an XML Snippet
Sometimes the data needed from a source record consists of a set of correlated values, such as a set of attribute-value pairs or any other matched set of data.
In these cases, the query results can include the XML element from the record that contains the set of correlated values. This XML snippet can then be parsed to extract the desired data into a form that is more convenient for further processing. The following Python function is one example of how a correlated set of data can be extracted from the returned XML element: it creates and returns Python dictionaries that other functions can use to look up the values they need.
Example:
import re

# This function returns two dictionaries as a tuple.
# The first dictionary holds all of the attributes and their values.
# The second dictionary holds all of the XML tags and their values.
def parse_xml_tags(xml_string):
    '''xml_string = '<firstName attr1="test">John</firstName><lastName>Doe</lastName>'
    will return ({'attr1': 'test'}, {'firstName': 'John', 'lastName': 'Doe'})'''
    # Collect every name="value" pair found in the snippet
    attribute_value_pairs = dict(re.findall(r'(\w+)="(.*?)"', xml_string))
    # Collect <tag ...>text</tag> pairs; \1 requires the closing tag to match the opening tag
    xml_values = dict(re.findall(r'<(\w+)[^>]*>(.*?)</\1>', xml_string))
    return (attribute_value_pairs, xml_values)
This function is not a complete solution, but it illustrates a technique that can be used when writing a compound query that processes the query output and generates an output file compatible with the tool the end user wants to use (such as Excel).
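As a rough illustration of that last step, the sketch below turns several XML snippets into CSV text, which Excel can open. It is only one possible approach under stated assumptions: the names snippets_to_csv and sample_snippets are hypothetical, the sample data is invented, and CSV is assumed to be an acceptable output format. A compact copy of the regex-based parser is included so the example runs on its own.

```python
import csv
import io
import re


def parse_xml_tags(xml_string):
    """Regex-based parser: returns (attribute dict, tag-value dict)."""
    attribute_value_pairs = dict(re.findall(r'(\w+)="(.*?)"', xml_string))
    xml_values = dict(re.findall(r'<(\w+)[^>]*>(.*?)</\1>', xml_string))
    return (attribute_value_pairs, xml_values)


def snippets_to_csv(snippets, out_file):
    """Hypothetical helper: write one CSV row per XML snippet.

    The columns are the union of all tag names, in first-seen order.
    """
    rows = [parse_xml_tags(snippet)[1] for snippet in snippets]
    fieldnames = []
    for row in rows:
        for name in row:
            if name not in fieldnames:
                fieldnames.append(name)
    # restval fills in an empty cell when a snippet lacks a given tag
    writer = csv.DictWriter(out_file, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)


# Invented sample data standing in for query results (one snippet per record)
sample_snippets = [
    '<firstName attr1="test">John</firstName><lastName>Doe</lastName>',
    '<firstName>Jane</firstName><lastName>Roe</lastName><city>Oslo</city>',
]

buffer = io.StringIO()
snippets_to_csv(sample_snippets, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file instead of an in-memory buffer, pass a file object opened with open(path, "w", newline="") as out_file. Because the values are collected in dictionaries, a tag that appears more than once in a single snippet keeps only its last value, so snippets with repeated tags would need a different data structure.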