Tag Archives: webdriver

Harharhar said the Santa when capturing browser test metrics with Webdriver and LittleMobProxy

In the modern age of big data and all that, it is trendy to capture as much data we can. So this is an attempt at capturing data on a set of web browsing tests I run on Selenium WebDriver. This is with Java using Selenium Webdriver and LittleMobProxy.

What I do here is configure an instance of LittleMobProxy to capture the traffic to/from a webserver in our tests. The captured data is written to a HAR file (HTTP archive file), the HAR file is parsed and the contents are used for whatever you like. In this case I dump them to InfluxDB and show some graphs on the generated sessions using Grafana.

This can be useful to see how much bandwidth your website uses, how many requests end up being generated, what elements are slowest to load, how your server caching configurations affect everything, and so on. I have used it to provide data for overall network performance analysis by simulating a set of browsers and capturing the data on their sessions.

First up, start the LittleMobProxy instance, create a WebDriver instance, and configure the WebDriver instance to use the LittleMobProxy instance:

    // start the proxy
    BrowserMobProxy proxy = new BrowserMobProxyServer();
    proxy.start(0);
    // get the Selenium proxy object
    Proxy seleniumProxy = ClientUtil.createSeleniumProxy(proxy);

    // configure it as a desired capability
    DesiredCapabilities capabilities = new DesiredCapabilities();
    capabilities.setCapability(CapabilityType.PROXY, seleniumProxy);

    driver = new ChromeDriver(capabilities);
    proxy.newHar("my_website_data");

Then we run some scripts on our website. The following is just a simple example as I do not wish to put here bots for browsing common website. I have prototyped some on browsing various videos on YouTube and on browsing different news portals. Generally this might be against the terms of service on public websites, so either use your own service you are testing (with a test service instance) or download a Wikipedia dump or something similar to run your tests on. Example code:

  public List<WebElement> listArticles() {
    List<WebElement> elements = driver.findElements(By.className("news"));
    List<WebElement> invisibles = new ArrayList<>();
    for (WebElement element : elements) {
      if (!element.isDisplayed()) {
        System.out.println("Not displayed:"+element);
        invisibles.add(element);
      }
    }
    elements.removeAll(invisibles);
    List<WebElement> articles = new ArrayList<>();
    List<String> descs = new ArrayList<>();
    for (WebElement element : elements) {
      List<WebElement> links = element.findElements(By.tagName("a"));
      for (WebElement link : links) {
        //we only take long links as this website has some "features" in the content portal causing pointless short links.
        //This also removes the "share" buttons for facebook, twitter, etc. which we do not want to hit
        //A better alternative might be to avoid links leading out of the domain 
        //(if you can figure them out..)
        //this is likely true for the ads as well..
        if (link.getText().length() > 20) {
          descs.add(link.getText());
          articles.add(link);
        }
      }
    }
    return articles;
  }

  public void openRandomArticle() throws Exception {
    List<WebElement> articles = listArticles();
    //sometimes our randomized user might hit a seemingly dead end on the article tree, 
    //in which case we just to the news portal main page ("base" here)
    if (articles.size() == 0) {
      driver.get(base);
      Thread.sleep(2000);
      articles = listArticles();
    }
    //this is a random choice of the previously filtered article list
    WebElement link = TestUtils.oneOf(articles);
    Actions actions = new Actions(driver);
    actions.moveToElement(link);
    actions.click();
    actions.perform();

    Har har = proxy.getHar();
    har.writeTo(new File(“my_website.har”));
    //if we just want to print it, we can do this..
    WebTestUtils.printHar(har);
    //or to drop stats in a database, do something like this
    WebTestUtils.influxHar(har);
  }

The code to access the data in the HAR file:

  public static void printHar(Har har) {
    HarLog log = har.getLog();
    List<HarPage> pages = log.getPages();
    for (HarPage page : pages) {
      String id = page.getId();
      String title = page.getTitle();
      Date started = page.getStartedDateTime();
      System.out.println("page: id=" + id + ", title=" + title + ", started=" + started);
    }
    List<HarEntry> entries = log.getEntries();
    for (HarEntry entry : entries) {
      String pageref = entry.getPageref();
      long time = entry.getTime();

      HarRequest request = entry.getRequest();
      long requestBodySize = request.getBodySize();
      long requestHeadersSize = request.getHeadersSize();
      String url = request.getUrl();

      HarResponse response = entry.getResponse();
      long responseBodySize = response.getBodySize();
      long responseHeadersSize = response.getHeadersSize();
      int status = response.getStatus();

      System.out.println("entry: pageref=" + pageref + ", time=" + time + ", reqBS=" + requestBodySize + ", reqHS=" + requestHeadersSize +
              ", resBS=" + responseBodySize + ", resHS=" + responseHeadersSize + ", status=" + status + ", url="+url);
    }
  }

So we can use this in many ways. Above I have just printed out some of the basic stats. Some example information available can be found on the internet, e.g. https://confluence.atlassian.com/display/KB/Generating+HAR+files+and+Analysing+Web+Requests. In the following I show some simple data from browsing a local newssite, visualized in Grafana using InfluxDB as a backend:

Here is some example code to write some of the HAR stats to InfluxDB:

  public static void influxHar(Har har) {
    HarLog harLog = har.getLog();
    List<HarPage> pages = harLog.getPages();
    for (HarPage page : pages) {
      String id = page.getId();
      String title = page.getTitle();
      Date started = page.getStartedDateTime();
      System.out.println("page: id=" + id + ", title=" + title + ", started=" + started);
    }
    List<HarEntry> entries = harLog.getEntries();
    long now = System.currentTimeMillis();
    int counter = 0;
    for (int i = index ; i < entries.size() ; i++) {
      HarEntry entry = entries.get(i);
      String pageref = entry.getPageref();
      long loadTime = entry.getTime();

      HarRequest request = entry.getRequest();
      if (request == null) {
        log.debug("Null request, skipping HAR entry");
        continue;
      }
      HarResponse response = entry.getResponse();
      if (response == null) {
        log.debug("Null response, skipping HAR entry");
        continue;
      }

      Map<String, Long> data = new HashMap<>();
      data.put("loadtime", loadTime);
      data.put("req_head", request.getHeadersSize());
      data.put("req_body", request.getBodySize());
      data.put("resp_head", response.getHeadersSize());
      data.put("resp_body", response.getBodySize());
      InFlux.store("browser_stat", now, data);
      counter++;
    }
    index += counter;
    System.out.println("count:"+counter);
  }

And the code to write the data into InfluxDB..

  private static InfluxDB db;

  static {
    if (Config.INFLUX_ENABLED) {
      db = InfluxDBFactory.connect(Config.INFLUX_URL, Config.INFLUX_USER, Config.INFLUX_PW);
      db.enableBatch(2000, 1, TimeUnit.SECONDS);
      db.createDatabase(Config.INFLUX_DB);
    }
  }

  public static void store(String name, long time, Map<String, Long> data) {
    if (!Config.INFLUX_ENABLED) return;
    Point.Builder builder = Point.measurement(name)
            .time(time, TimeUnit.MILLISECONDS)
            .tag("tom", name);
    for (String key : data.keySet()) {
      builder.field(key, data.get(key));
    }
    Point point = builder.build();
    log.debug("storing:"+point);
    //you should have enabled batch mode (as shown above) on the driver or this will bottleneck
    db.write(Config.INFLUX_DB, "default", point);
  }

And here are some example data visualized with Grafana for some metrics I collected this way:

Untitled

In the lower line chart, the number of elements loaded on click is shown. This refers to how many HTTP requests are generated per a WebDriver click on the website. The upper line/plot chart shows the minimum, maximum and average load times for different requests/responses. That is, how much time it took for the server to send back the HTTP responses for the clicks.

This shows how the first few page loads generated high number of HTTP requests/responses. After this, the amount goes down and stays quite steady at a lower level.  I assume this is due to the browser having cached much of the static content and not needing to request it all the time. Occasionally our simulated random browser user enters a slightly less explored path on the webpage, causing a small short term spike.

This nicely shows how modern websites end up generating surprisingly large numbers of requests. Also this shows how some requests are pretty slow to respond, so this might be a useful point to investigate for optimizing overall response time.

That’s it. Not too complicated but I find it rather interesting. Also does not require too many modifications to the existing WebDriver tests, just taking into use the proxy component, parsing the HAR file and writing to the database.

remote webdriver over ssh

needed to run some tests for a web application on a remote server. only ssh access was realistic. was using selenium webdriver. wanted to use the real browser, which requires the gui of course. is that possible? yes…

remote system is running ubuntu. first we install vnc on it. it is some kind of virtual remote desktop thing. uu yes

sudo apt-get install x11vnc

sudo apt-get install xvfb

the first one is the vnc server, the second one allows you to create new virtual displays while starting the server over command line.

start the remote server

x11vnc -safer -localhost -nopw -forever -create

safer: all the examples had this so i put it there. doh?

localhost: only allow connections from localhost (where you ssh anyway)

forever: allow connecting to the virtual desktop as many times as you like, so you can shut down your ssh client, restart the ssh connection later, reconnect to the virtual display and check the results and what it is doing. you might need to “nohup” it, i forget.. but try it out.

to only allow one time connection and no more change “forever” to “once”.

then to reconnect to the same display at a later time with another ssh session, do the following

x11vnc -safer -localhost -nopw -forever -display :20

where the only difference is that instead of “create” we now have “display” with the identifier of the display we created previously.

then you need to be able to connect to your virtual desktop. i used tightvnc which is a java-based vnc client. when you run x11vnc it prints out the port number. usually that is 5900 but if that is taken might be something else.

so we need to set up an ssh tunnel for that. in putty you go to settings->connection->ssh->tunnels and set whatever port you like in “source port”, lets pretend that is also 5900. and on the “destination” set “locahost:5900” or whatever your port is. now start tightvnc and set it to connect to “localhost:5900”. that is, this is the source port you set in putty (or whatever you use for your tunnel). confusing with the same ports, right, well can’t make it easy on myself arrrrrr…

then you have a connection (you wish). now you will see at the top your display id such as :20. use this later to re-connect your vnc session if needed. you can also find this out by logging into the system over ssh while using the remote session and typing “w”. the w command lists the tty of the user as the display id.

now you have graphical display on the remote system. so now you just run your webdriver tests as usual. with some luck, you can just leave them running and later come back, connect to the same virtual display id, and see what it has managed to test for you.. whooppee.