Asked  2 Years ago    Answers:  5   Viewed   73 times

For some reason this code will not let me into the website when I use the correct login information. The System.out.println posts the code of the login page, indicating my code did not work. Can someone tell me what I'm forgetting or what's wrong with it?

public void connect() {

    try {
        Connection.Response loginForm = Jsoup.connect("https://www.capitaliq.com/CIQDotNet/Login.aspx/login.php")
                .method(Connection.Method.GET)
                .execute();

        org.jsoup.nodes.Document document = Jsoup.connect("https://www.capitaliq.com/CIQDotNet/Login.aspx/authentication.php")
                .data("cookieexists", "false")
                .data("username", "myUsername")
                .data("password", "myPassword")
                .cookies(loginForm.cookies())
                .post();
        System.out.println(document);
    } catch (IOException ex) {
        Logger.getLogger(WebCrawler.class.getName()).log(Level.SEVERE, null, ex);
    }
}

 Answers

5

Besides the username, password and the cookies, the site requeires two additional values for the login - VIEWSTATE and EVENTVALIDATION.
You can get them from the response of the first Get request, like this -

Document doc = loginForm.parse();
Element e = doc.select("input[id=__VIEWSTATE]").first();
String viewState = e.attr("value");
e = doc.select("input[id=__EVENTVALIDATION]").first();
String eventValidation = e.attr("value");

And add it after the password (the order doesn't really matter) -

org.jsoup.nodes.Document document = (org.jsoup.nodes.Document) Jsoup.connect("https://www.capitaliq.com/CIQDotNet/Login.aspx/authentication.php").userAgent("Mozilla/5.0")               
            .data("myLogin$myUsername", "MyUsername")
            .data("myLogin$myPassword, "MyPassword")
            .data("myLogin$myLoginButton.x", "22")                   
            .data("myLogin$myLoginButton.y", "8")
            .data("__VIEWSTATE", viewState)
            .data("__EVENTVALIDATION", eventValidation)
            .cookies(loginForm.cookies())
            .post();

I would also add the userAgent field to both requests - some sites test it and send different pages to different clients, so if you would like to get the same response as you get with your browser, add to the requests .userAgent("Mozilla/5.0") (or whatever browser you're using).

Edit
The userName's field name is myLogin$myUsername, the password is myLogin$myPassword and the Post request also contains data about the login button. Ican't test it, because I don't have user at that site, but I believe it will work. Hope this solves your problem.

EDIT 2
To enable the remember me field during login, add this line to the post request:

.data("myLogin$myEnableAutoLogin", "on")
Thursday, October 20, 2022
3

I ended up using Python with Selenium Firefox web driver. Since I'm using a real browser, I can do everything FF can.

Tuesday, November 29, 2022
 
4

It's not very much different to have a nested formarray. Basically you just duplicate the code you have... with nested array :) So here's a sample:

myForm: FormGroup;

constructor(private fb: FormBuilder) {
  this.myForm = this.fb.group({
    // you can also set initial formgroup inside if you like
    companies: this.fb.array([])
  })
}

addNewCompany() {
  let control = <FormArray>this.myForm.controls.companies;
  control.push(
    this.fb.group({
      company: [''],
      // nested form array, you could also add a form group initially
      projects: this.fb.array([])
    })
  )
}

deleteCompany(index) {
  let control = <FormArray>this.myForm.controls.companies;
  control.removeAt(index)
}

So that is the add and delete for the outermost form array, so adding and removing formgroups to the nested form array is just duplicating the code. Where from the template we pass the current formgroup to which array you want to add (in this case) a new project/delete a project.

addNewProject(control) {
  control.push(
    this.fb.group({
      projectName: ['']
  }))
}

deleteProject(control, index) {
  control.removeAt(index)
}

And the template in the same manner, you iterate your outer formarray, and then inside that iterate your inner form array:

<form [formGroup]="myForm">
  <div formArrayName="companies">
    <div *ngFor="let comp of myForm.get('companies').controls; let i=index">
    <h3>COMPANY {{i+1}}: </h3>
    <div [formGroupName]="i">
      <input formControlName="company" />
      <button (click)="deleteCompany(i)">
         Delete Company
      </button>
      <div formArrayName="projects">
        <div *ngFor="let project of comp.get('projects').controls; let j=index">
          <h4>PROJECT {{j+1}}</h4>
          <div [formGroupName]="j">
            <input formControlName="projectName" />
            <button (click)="deleteProject(comp.controls.projects, j)">
              Delete Project
            </button>
          </div>
        </div>
        <button (click)="addNewProject(comp.controls.projects)">
          Add new Project
        </button>
      </div>
    </div>
  </div>
</div>

DEMO

EDIT:

To set values to your form once you have data, you can call the following methods that will iterate your data and set the values to your form. In this case data looks like:

data = {
  companies: [
    {
      company: "example comany",
      projects: [
        {
          projectName: "example project",
        }
      ]
    }
  ]
}

We call setCompanies to set values to our form:

setCompanies() {
  let control = <FormArray>this.myForm.controls.companies;
  this.data.companies.forEach(x => {
    control.push(this.fb.group({ 
      company: x.company, 
      projects: this.setProjects(x) }))
  })
}

setProjects(x) {
  let arr = new FormArray([])
  x.projects.forEach(y => {
    arr.push(this.fb.group({ 
      projectName: y.projectName 
    }))
  })
  return arr;
}
Monday, November 14, 2022
 
cihan
 
3

What you see in your web browser is not what Jsoup sees. Disable JavaScript and refresh page to get what Jsoup gets OR press CTRL+U ("Show source", not "Inspect"!) in your browser to see original HTML document before JavaScript modifications. When you use your browser's debugger it shows final document after modifications so it's not not suitable for your needs.

It seems like whole "UPCOMING EVENTS" section is dynamically loaded by JavaScript. Even more, this section is asynchronously loaded with AJAX. You can use your browsers debugger (Network tab) to see every possible request and response.

I found it but unfortunately all the data you need is returned as JSON so you're going to need another library to parse JSON.

That's not the end of the bad news and this case is more complicated. You could make direct request for the data: http://www.bellator.com/feeds/ent_m152_bellator/V1_1_0/d10a728c-547e-4a6f-b140-7eecb67cff6b but the URL seems random and few of these URLs (one per upcoming event?) are included inside JavaScript code in HTML.

My approach would be to get the URLs of these feeds with something like:


        List<String> feedUrls = new ArrayList<>();

        //select all the scripts
        Elements scripts = document.select("script");
        for(Element script: scripts){
            if(script.text().contains("http://www.bellator.com/feeds/")){
                // here use regexp to get all URLs from script.text() and add them to feedUrls

            }
        }

        for(String feedUrl : feedUrls){
            // iterate over feed URLs, download each of them
            String json = Jsoup.connect(feedUrl).ignoreContentType(true).get().body().toString();
            // here use JSON parsing library to get the data you need

        }

ALTERNATIVE approach would be to stop using Jsoup because of its limitations and use Selenium Webdriver as it supports dynamic page modifications by JavaScript so you'd get the HTML of the final result - exactly what you see in web browser and Inspector.

Saturday, September 10, 2022
4

for some odd reason (perhaps ui-router versioning) the controllerAs is not working.

Diana´s answer is valid, but it doesn´t use the controlleras syntax. If you still want to use it, change the ui-router setting to:

.state('login', {
        url: "/login",
        templateUrl: "login-form.html",
        controller: 'LoginCtrl as login',
    })

That should do the trick ;)

Check it out: http://plnkr.co/edit/OCqNexVeFFxt4kUuEVc1?p=preview

Saturday, November 19, 2022
 
Only authorized users can answer the search term. Please sign in first, or register a free account.
Not the answer you're looking for? Browse other questions tagged :
 

Browse Other Code Languages