I need to fetch a given web page, convert its HTML tags to XML tags, and then build a tree from those XML tags. How can I do that? Please point me to a good link or tutorial on this. By the way, I am using Java.
Thanks.
Use HttpClient to fetch the page and HtmlCleaner to turn the HTML into XML.
Both have tutorials.
Take a look at Apache HttpClient (http://hc.apache.org/httpcomponents-client-ga/) and HtmlCleaner (http://htmlcleaner.sourceforge.net/).
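
For the last step (building a tree from the XML), here is a rough sketch using only the JDK's built-in DOM parser. It assumes the page has already been fetched and cleaned into well-formed XML by something like HtmlCleaner; the sample string, the class name XmlTreeSketch, and the helper names parse and outline are made up for illustration and do not come from either library.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XmlTreeSketch {

    // Parse a well-formed XML string into a DOM tree.
    // Raw HTML usually fails here, which is why a cleaning step comes first.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    // Walk the tree recursively and list element names,
    // indented by two spaces per nesting level.
    static String outline(Node node, int depth) {
        StringBuilder sb = new StringBuilder();
        int childDepth = depth;
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            for (int i = 0; i < depth; i++) {
                sb.append("  ");
            }
            sb.append(node.getNodeName()).append('\n');
            childDepth = depth + 1;
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            sb.append(outline(children.item(i), childDepth));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the XML a cleaner would produce.
        String xml = "<html><head><title>Demo</title></head>"
                + "<body><p>Hello</p></body></html>";
        Document doc = parse(xml);
        System.out.print(outline(doc.getDocumentElement(), 0));
    }
}
```

Running main prints the element names html, head, title, body, p, each indented by nesting depth. The org.w3c.dom.Document returned by parse is the tree itself, so you can navigate it with getChildNodes() or getElementsByTagName() instead of printing it.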