How to convert partial XML to hash in Ruby


I have a string that contains plain text, spaces, and carriage returns, with XML-like tags mixed in:

string = "hi there.  <set-topic> initiate </set-topic>  <setprofile>    <key>name</key>    <value>joe</value> </setprofile>   <setprofile>    <key>email</key>    <value>email@hi.com</value> </setprofile>  <get-relations>   <collection>goals</collection>   <value>walk upstairs</value> </get-relations> think?  true?  " 

I want to parse this in a way similar to how you would use Nori, Nokogiri, or Ox to convert XML to a hash.

My goal is to be able to pull out the top-level tags as keys and then know all of their elements, like:

keys = ['setprofile', 'setprofile', 'set-topic', 'get-object']
values[0] = [{name => joe}, {email => email@hi.com}]
values[3] = [{collection => goals}, {value => walk up}]

I have seen several functions that work for true XML, but all of mine is partial.

I started going down this line of thinking:

parsed = doc.search('*').each_with_object({}) do |n, h|
  (h[n.name] ||= []) << n.text
end

I'd do something along these lines if I wanted the keys and values variables:

require 'nokogiri'

string = "hi there.  <set-topic> initiate </set-topic>  <setprofile>    <key>name</key>    <value>joe</value> </setprofile>   <setprofile>    <key>email</key>    <value>email@hi.com</value> </setprofile>  <get-relations>   <collection>goals</collection>   <value>walk upstairs</value> </get-relations> think?  true? "

# Wrap the fragment in a root element so it parses as a single XML document.
doc = Nokogiri::XML('<root>' + string + '</root>', nil, nil, Nokogiri::XML::ParseOptions::NOBLANKS)

# Skip the top-level text nodes, then map each element to
# [tag name, {child name => child content}].
nodes = doc.root.children.reject { |n| n.is_a?(Nokogiri::XML::Text) }.map { |node|
  [
    node.name, node.children.map { |c|
      [c.name, c.content]
    }.to_h
  ]
}
nodes
# => [["set-topic", {"text"=>" initiate "}],
#     ["setprofile", {"key"=>"name", "value"=>"joe"}],
#     ["setprofile", {"key"=>"email", "value"=>"email@hi.com"}],
#     ["get-relations", {"collection"=>"goals", "value"=>"walk upstairs"}]]

From nodes it's possible to grab the rest of the detail:

keys = nodes.map(&:first)
# => ["set-topic", "setprofile", "setprofile", "get-relations"]

values = nodes.map(&:last)
# => [{"text"=>" initiate "},
#     {"key"=>"name", "value"=>"joe"},
#     {"key"=>"email", "value"=>"email@hi.com"},
#     {"collection"=>"goals", "value"=>"walk upstairs"}]

values[0] # => {"text"=>" initiate "}

If you'd rather, it's possible to pre-process the DOM and remove the top-level text:

doc.root.children.select { |n| n.is_a?(Nokogiri::XML::Text) }.map(&:remove)
doc.to_xml
# => "<root><set-topic> initiate </set-topic><setprofile><key>name</key><value>joe</value></setprofile><setprofile><key>email</key><value>email@hi.com</value></setprofile><get-relations><collection>goals</collection><value>walk upstairs</value></get-relations></root>\n"

That makes it easier to work with the XML.
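The question asks for repeated tags such as setprofile to end up together under one key. A minimal sketch of one way to get there from the nodes pairs, assuming the array has the shape shown in the output earlier; the literal copy of nodes and the names grouped and profile are introduced here only for illustration, so the snippet runs without Nokogiri:

```ruby
# Hypothetical literal copy of the `nodes` pairs from the answer's output,
# so this sketch is runnable on its own.
nodes = [
  ["set-topic", { "text" => " initiate " }],
  ["setprofile", { "key" => "name", "value" => "joe" }],
  ["setprofile", { "key" => "email", "value" => "email@hi.com" }],
  ["get-relations", { "collection" => "goals", "value" => "walk upstairs" }]
]

# Collect the child hashes under their top-level tag name, so a tag that
# appears more than once (setprofile) gets an array with one entry per tag.
grouped = nodes.each_with_object({}) do |(name, children), acc|
  (acc[name] ||= []) << children
end

# Optionally collapse the {"key"=>k, "value"=>v} pairs into a single hash,
# which is close to the {name => joe} form in the question.
profile = grouped["setprofile"].to_h { |pair| [pair["key"], pair["value"]] }

p grouped.keys
p grouped["setprofile"]
p profile
```

Grouping loses the relative order of different tags, so keep the original nodes array around if document order matters. The block form of to_h needs Ruby 2.6 or newer.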

