
Html Parsing With Jsoup

I am trying to parse the HTML of the following URL: http://ocw.mit.edu/courses/aeronautics-and-astronautics/16-050-thermal-energy-fall-2002/ to obtain the text of the '<p>'

Solution 1:

I don't know much about jsoup, but if you want the instructor's name, you could probably select it with something like the line below. Note that select() returns an Elements collection, not a single Element:

Elements instructor = doc.select("div.chpstaff div p");

Solution 2:

Maybe you have already solved this, but I worked on it, so I can't resist posting my version:

import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JavaApplication17 {

    public static void main(String[] args) {
        try {
            String url = "http://ocw.mit.edu/courses/aeronautics-and-astronautics/16-050-thermal-energy-fall-2002/";
            // Fetch the page and parse it into a Document
            Document doc = Jsoup.connect(url).get();
            // Select every <p> element and print its text
            Elements paragraphs = doc.select("p");
            for (Element p : paragraphs) {
                System.out.println(p.text());
            }
        } catch (IOException ex) {
            Logger.getLogger(JavaApplication17.class.getName())
                  .log(Level.SEVERE, null, ex);
        }
    }
}

Is this what you meant?

Solution 3:

Here's a short example:

// Connect to the website and parse it into a document
Document doc = Jsoup.connect("http://ocw.mit.edu/courses/aeronautics-and-astronautics/16-050-thermal-energy-fall-2002/").get();

// Select all elements you need (see below for documentation)
Elements elements = doc.select("div[class=chpstaff] p");

// Get the text of the first element
String instructor = elements.first().text();

// e.g. print the result
System.out.println(instructor);

Take a look at the documentation of the jsoup selector API here: Jsoup Codebook. It's not very difficult to use, but it is very powerful.
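As a small offline sketch of the selector idea above: the HTML string here is made up to stand in for the course page (the real markup may differ), and the class name SelectorSketch and the method firstStaffParagraph are hypothetical names chosen for illustration. It parses from a string with Jsoup.parse() so no network access is needed, and it guards against first() returning null when nothing matches.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class SelectorSketch {

    // Returns the text of the first <p> under div.chpstaff, or "" if none matches.
    static String firstStaffParagraph(String html) {
        Document doc = Jsoup.parse(html);
        // "div.chpstaff p" matches <p> descendants of any div carrying the class chpstaff
        Element first = doc.select("div.chpstaff p").first();
        // first() returns null when the selection is empty
        return first == null ? "" : first.text();
    }

    public static void main(String[] args) {
        // Made-up sample markup, assumed to resemble the page's staff block
        String html = "<div class=\"chpstaff\"><div><p>Prof. Example</p></div></div>"
                    + "<p>Unrelated paragraph</p>";
        System.out.println(firstStaffParagraph(html));
    }
}
```

One design note: div.chpstaff matches a div that has chpstaff among several classes, while div[class=chpstaff] only matches when the class attribute is exactly that value, so the first form is usually the more forgiving choice.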

Solution 4:

Here is some code:

Document document = Jsoup.connect("http://ocw.mit.edu/courses/aeronautics-and-astronautics/16-050-thermal-energy-fall-2002/").get();

Elements elements = document.select("p");
System.out.println(elements.html());

You can select all <p> tags using jsoup's selector syntax. Calling html() on the result returns the text and nested tags inside each <p>.
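To illustrate the difference between text() and html() on a selected element, here is a minimal sketch. The sample markup and the names TextVersusHtml, firstParagraphText and firstParagraphHtml are invented for this example and are not taken from the course page.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

class TextVersusHtml {

    // Text content of the first <p>, with nested tags stripped.
    static String firstParagraphText(String html) {
        Element p = Jsoup.parse(html).select("p").first();
        return p == null ? "" : p.text();
    }

    // Inner HTML of the first <p>, with nested tags kept.
    static String firstParagraphHtml(String html) {
        Element p = Jsoup.parse(html).select("p").first();
        return p == null ? "" : p.html();
    }

    public static void main(String[] args) {
        // Made-up sample markup for illustration only
        String html = "<p>Taught by <b>Prof. Example</b></p>";
        System.out.println(firstParagraphText(html)); // plain text, no <b> tag
        System.out.println(firstParagraphHtml(html)); // keeps the nested <b> tag
    }
}
```

If only the readable text is needed, text() is usually the better fit; html() is useful when the nested markup (links, emphasis) has to be preserved.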

Solution 5:

Elements ele = doc.select("p");
String text = ele.text();
System.out.println(text);

Try this; I think it will work.
