
For some reason this code will not log me into the website, even when I use the correct login information. The System.out.println prints the HTML of the login page, which indicates that the login did not work. Can someone tell me what I'm forgetting or what's wrong with it?

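public void connect() {

    try {
        // First request: GET the login page to obtain the session cookies.
        Connection.Response loginForm = Jsoup.connect("")
                .method(Connection.Method.GET)
                .execute();

        // Second request: POST the credentials together with those cookies.
        org.jsoup.nodes.Document document = Jsoup.connect("")
                .data("cookieexists", "false")
                .data("username", "myUsername")
                .data("password", "myPassword")
                .cookies(loginForm.cookies())
                .post();
        System.out.println(document);
    } catch (IOException ex) {
        Logger.getLogger(WebCrawler.class.getName()).log(Level.SEVERE, null, ex);
    }
}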



Besides the username, password and cookies, the site requires two additional values for the login: __VIEWSTATE and __EVENTVALIDATION.
You can get them from the response to the first GET request, like this -

Document doc = loginForm.parse();
Element e = doc.select("input[id=__VIEWSTATE]").first();
String viewState = e.attr("value");
e = doc.select("input[id=__EVENTVALIDATION]").first();
String eventValidation = e.attr("value");

And add it after the password (the order doesn't really matter) -

org.jsoup.nodes.Document document = Jsoup.connect("").userAgent("Mozilla/5.0")
            .data("myLogin$myUsername", "MyUsername")
            .data("myLogin$myPassword", "MyPassword")
            .data("myLogin$myLoginButton.x", "22")
            .data("myLogin$myLoginButton.y", "8")
            .data("__VIEWSTATE", viewState)
            .data("__EVENTVALIDATION", eventValidation)
            .cookies(loginForm.cookies())
            .post();

I would also add the userAgent field to both requests. Some sites check it and send different pages to different clients, so if you would like to get the same response as you get with your browser, add .userAgent("Mozilla/5.0") (or whatever browser you're using) to the requests.

The username's field name is myLogin$myUsername, the password's is myLogin$myPassword, and the POST request also contains data about the login button. I can't test it because I don't have an account on that site, but I believe it will work. Hope this solves your problem.
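
As a rough illustration of the hidden-field step, the sketch below collects every named hidden input from a parsed login page into a map, so that values such as __VIEWSTATE and __EVENTVALIDATION do not have to be picked out one by one. The class name HiddenFieldCollector, the helper hiddenFields and the sample HTML are made up for this example, and the real page may use different markup. It assumes Jsoup is on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

final class HiddenFieldCollector {

    // Returns the name/value pairs of all hidden inputs that carry a name attribute,
    // in document order.
    static Map<String, String> hiddenFields(Document doc) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (Element input : doc.select("input[type=hidden][name]")) {
            fields.put(input.attr("name"), input.attr("value"));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the login page; the real page is fetched with the first GET.
        String html = "<form>"
                + "<input type=\"hidden\" name=\"__VIEWSTATE\" id=\"__VIEWSTATE\" value=\"abc\">"
                + "<input type=\"hidden\" name=\"__EVENTVALIDATION\" id=\"__EVENTVALIDATION\" value=\"def\">"
                + "<input type=\"text\" name=\"myLogin$myUsername\">"
                + "</form>";

        Map<String, String> fields = hiddenFields(Jsoup.parse(html));
        System.out.println(fields);
    }
}
```

The resulting map could then be handed to the POST request in one call with .data(fields), next to the username and password entries.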

To enable the remember me field during login, add this line to the post request:

.data("myLogin$myEnableAutoLogin", "on")
Thursday, October 20, 2022

What you see in your web browser is not what Jsoup sees. Disable JavaScript and refresh the page to get what Jsoup gets, OR press CTRL+U ("Show source", not "Inspect"!) in your browser to see the original HTML document before JavaScript modifications. Your browser's debugger shows the final document after modifications, so it's not suitable for your needs.

It seems like the whole "UPCOMING EVENTS" section is dynamically loaded by JavaScript. Even more, this section is loaded asynchronously with AJAX. You can use your browser's debugger (Network tab) to see every request and response.

I found it but unfortunately all the data you need is returned as JSON so you're going to need another library to parse JSON.

That's not the end of the bad news: this case is more complicated. You could request the data directly, but the URL seems random, and several of these URLs (one per upcoming event?) are embedded in JavaScript code in the HTML.

My approach would be to get the URLs of these feeds with something like:

        List<String> feedUrls = new ArrayList<>();

        // select all the scripts
        Elements scripts = doc.select("script");
        for (Element script : scripts) {
            // here use regexp to get all URLs from script.data() and add them to feedUrls
        }

        for (String feedUrl : feedUrls) {
            // iterate over feed URLs, download each of them;
            // execute().body() returns the raw response text (get() would parse it as HTML)
            String json = Jsoup.connect(feedUrl).ignoreContentType(true).execute().body();
            // here use JSON parsing library to get the data you need
        }
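
The regexp step left open in the first loop could look roughly like the sketch below. It assumes the feed URLs appear as absolute http(s) URLs inside quoted JavaScript string literals, which may not hold for the real page. The class name FeedUrlExtractor, the helper extractUrls and the example.com URLs are invented for illustration. Only the JDK is used, so the helper can be tried without Jsoup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FeedUrlExtractor {

    // Assumption: a feed URL is an absolute http(s) URL wrapped in single or double quotes.
    // Group 1 captures the URL without the surrounding quotes.
    private static final Pattern QUOTED_URL =
            Pattern.compile("[\"'](https?://[^\"'\\s]+)[\"']");

    // Returns every quoted URL found in the given script text, in order of appearance.
    static List<String> extractUrls(String scriptText) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = QUOTED_URL.matcher(scriptText);
        while (matcher.find()) {
            urls.add(matcher.group(1));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Hypothetical script content standing in for the text of one <script> element.
        String script = "loadFeed('https://example.com/feed/abc123.json');\n"
                + "var other = \"https://example.com/feed/def456.json\";";
        for (String url : extractUrls(script)) {
            System.out.println(url);
        }
    }
}
```

Inside the first loop this would be called as feedUrls.addAll(FeedUrlExtractor.extractUrls(script.data())). If the page mixes feed URLs with other URLs, the pattern would need to be narrowed to whatever distinguishes the feeds.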


An ALTERNATIVE approach would be to stop using Jsoup because of its limitations and use Selenium WebDriver instead. It supports dynamic page modifications by JavaScript, so you'd get the HTML of the final result: exactly what you see in the web browser and Inspector.

Saturday, September 10, 2022

You need to read the form before posting! You are missing the param subbera=Login.

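public static void main(String[] args) throws Exception {

    // GET the login page first to obtain the session cookies.
    Connection.Response loginForm = Jsoup.connect("")
            .method(Connection.Method.GET)
            .execute();

    // POST the form, including the submit button's param.
    Document document = Jsoup.connect("")
            .data("cookieexists", "false")
            .data("name", "username")
            .data("password", "pass")
            .data("subbera", "Login")
            .cookies(loginForm.cookies())
            .post();
    System.out.println(document);
}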

Saturday, October 8, 2022

Just add a CAPTCHA test for cases when there have been failed login attempts for a given user. This is what lots of websites currently do (all popular email services for instance) and is much less invasive.

Yet it completely thwarts brute force attacks, as long as the attacker cannot break your CAPTCHA.
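
One way to decide when to show the CAPTCHA is to count failed attempts per user, as in the minimal sketch below. The class name LoginAttemptTracker, its methods and the threshold value are assumptions made for illustration. A real application would also need to persist the counts and expire them after some time, which is left out here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class LoginAttemptTracker {

    // Number of failed attempts after which a CAPTCHA is demanded (an assumed policy).
    private final int threshold;
    private final Map<String, Integer> failures = new ConcurrentHashMap<>();

    LoginAttemptTracker(int threshold) {
        this.threshold = threshold;
    }

    // Call after a failed login for the given user.
    void recordFailure(String username) {
        failures.merge(username, 1, Integer::sum);
    }

    // Call after a successful login; clears the user's failure count.
    void recordSuccess(String username) {
        failures.remove(username);
    }

    // True once the user has reached the failure threshold.
    boolean captchaRequired(String username) {
        return failures.getOrDefault(username, 0) >= threshold;
    }

    public static void main(String[] args) {
        LoginAttemptTracker tracker = new LoginAttemptTracker(3);
        for (int i = 0; i < 3; i++) {
            tracker.recordFailure("alice");
        }
        System.out.println(tracker.captchaRequired("alice"));
        System.out.println(tracker.captchaRequired("bob"));
    }
}
```

The login handler would check captchaRequired(username) before validating the password and reject the attempt if the CAPTCHA answer is missing or wrong.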

Wednesday, November 9, 2022

Try this CSS selector:

h2#1 ~ *:not(h2#2 ~ *):not(h2#2)



h2#1 ~ *       /* Select any node preceded by h2#1 ... */
:not(h2#2 ~ *) /* ... and not preceded by h2#2 ... */
:not(h2#2)     /* ... and exclude h2#2 itself ! */

Tested on Jsoup 1.8.3

Sunday, August 7, 2022