0

I need to fetch a given web page, convert its HTML tags to XML tags, and then build a tree from those XML tags. How can I do that? Please point me to a good link or tutorial on this. By the way, I am using Java.

Thanks.

1
  • Have you tried to write any code yet at all? HTML often contains invalid hierarchical XML content. This is not going to be an easy task.
    – Mike Atlas
    Commented Apr 29, 2011 at 18:54

2 Answers

1

Use HttpClient to fetch the page and HtmlCleaner to turn the HTML into XML.

Both have tutorials.
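
For the last step in the question (building a tree from the XML), here is a minimal sketch that uses only the DOM parser shipped with the JDK. It assumes the HTML has already been cleaned into well-formed XML, for example by HtmlCleaner; the class name `XmlTreeSketch`, the method names, and the sample markup are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of HttpClient or HtmlCleaner.
class XmlTreeSketch {

    // Parses a well-formed XML string into a DOM tree.
    // Assumption: the input was already cleaned; raw HTML will usually fail here.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    // Walks the tree and returns an indented outline, one element name per line.
    static String outline(Node node, int depth) {
        StringBuilder sb = new StringBuilder();
        int childDepth = depth;
        if (node instanceof Element) {
            for (int i = 0; i < depth; i++) {
                sb.append("  ");
            }
            sb.append(node.getNodeName()).append('\n');
            childDepth = depth + 1;
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            sb.append(outline(children.item(i), childDepth));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample markup standing in for cleaned output of a real page.
        String xml = "<html><head><title>Demo</title></head>"
                + "<body><p>Hello</p></body></html>";
        System.out.print(outline(parse(xml), 0));
    }
}
```

Once you have the `Document`, you can traverse it with `getChildNodes()` as above or query it with XPath, depending on what kind of tree you need.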

0

Take a look at Apache HttpClient (http://hc.apache.org/httpcomponents-client-ga/) and HtmlCleaner (http://htmlcleaner.sourceforge.net/).
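
If you would rather not add a dependency for the fetch step, a rough sketch with the JDK's built-in `java.net.http.HttpClient` (Java 11 or later) might look like the following. Note that this is not the Apache HttpClient linked above, and the class name `FetchPageSketch`, its method names, and the 10-second timeout are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper class using the JDK HTTP client, not Apache HttpClient.
class FetchPageSketch {

    // Builds a plain GET request for the given URL.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10)) // arbitrary timeout for the sketch
                .GET()
                .build();
    }

    // Sends the request and returns the response body as a string.
    static String fetch(String url) {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        try {
            HttpResponse<String> response =
                    client.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
            return response.body();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while fetching " + url, e);
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("Usage: java FetchPageSketch.java <url>");
            return;
        }
        String html = fetch(args[0]);
        System.out.println(html.length() + " characters fetched");
    }
}
```

The returned string is raw HTML, so it still needs to go through a cleaner such as HtmlCleaner before an XML parser will accept it.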
