Over the years I’ve seen varying opinions on whether you should use Xpath locators as anything but a last resort. Back when I started I adopted the standpoint that I would avoid most usage of Xpath, but in situations where it really wasn’t practical to have IDs added I would use it as a last resort.
These days I don’t mind Xpath being used in situations where it makes sense. Things like fake tables (made of divs and spans), and where multiple copies of the same HTML structure exist, and adding identical IDs to those would not be valid HTML.
A few years ago, I would have expected using Xpath locators to lead to extremely fragile tests, but I’ve learned a few of the nuances of the language that lead to strong and readable locators. I believe they’re a great asset to testing web pages, as long as they’re used in a responsible and sensible manner.
Too many times when reading through WebDriver questions on Stack Overflow, I see this kind of thing:
This is the kind of thing you should never do, and never suggest anyone else do. If you write an Xpath like this and have to fix it months down the line, you’re going to have a difficult job remembering what this was doing. Even worse if a team member has to fix it, they won’t have a clue!
Just by looking at that locator, try to figure out what it’s locating. It’s a link contained within a hugely complicated number of other elements. Who really knows what it does though.
So how do you do Xpath better?
The first thing to know is that you don’t have to navigate through every parent element on the page before getting to the one you want. Xpaths starting with “//” are relative and allow you to start looking for an element anywhere on the page, contained within any number of other elements. “//” starts based on the current context, and all driver.findElement requests begin at the root of the document, so using a relative Xpath request means “Find this element anywhere on the page”.
Let’s take a simplified example; a page that shows a couple of products on it with their names and prices. In this example we’ll assume that we know the names of the products that we’re expecting to find on the page, and that we want to verify that their costs are as we expect.
Let’s take a look at the structure of the page we’ve got. We don’t have any IDs to work with, so we need to get a little more creative. Where do we start? We know we need to locate the price for a particular item and check that it’s what we expect. If we just checked that the price was somewhere on the page, it would not be specific enough for our case. It would only prove that that price existed somewhere on the page, and not necessarily near the product we expect it to be for.
So let’s locate our product first, and see where we need to go from there.
We can locate our product fairly simply, by using the text we know it should have, and the element we know it exists inside:
That’s a fairly simple Xpath, and it’s obvious what it does by looking at it. It finds a P element with the text ‘Product 2’.
I would encourage you to follow along with this using xpathtester.com . This is a website I use all the time when debugging Xpath selector issues. Copy and paste the Xpath into the Xpath field, and the HTML from above into the XML field. You’ll be able to see what Xpath sees at each step of the process.
Next we’ll make use of Xpath’s powerful axes features. These allow you to navigate the document based on where you currently are in it. We’re currently at a P tag contained within a DIV that contains additional information we’re interested in checking. The DIV is the parent tag of the P, so we can use the parent axis to travel back up the document to the DIV.
/.. is a shortcut for using the parent axis, so we’ll use that here. We now have the DIV that contains all the product information we’re interested in and we can now locate only elements inside that DIV and nowhere else. We know that any information we collect now is related directly to the product with the name “Product 2”.
Looking at the parent DIV in the HTML above, we can see that the price is located inside a P tag with the class “item-price”. Let’s add that to our Xpath:
This Xpath is slightly more complicated now, but still fairly readable. We get a P element with the text “Product 2”, go back to its parent, and then inside that parent we go foward into a P element with the class “item-price”. If this were to break and someone else in your team had to fix it, that would give them vital clues about what exactly it is that’s broken. The code is failing to find the item-price, and they need to fix the Xpath so that it can find the price again.
It’s descriptive of what it’s doing, and that is an incredibly important aspect of making good tests. People need to be able to quickly understand what your code is doing. Test automation projects live and die based on how easily other members of the team can understand and modify your code when necessary. If nobody can maintain the code all value in the tests are lost the moment something breaks.
This article has been more about the philosophy behind writing understandable Xpaths, rather than teaching the basics of how to write them. If you’re interested in understanding more about Xpath there’s a few decent resources about them out there, including this one over at guru99 that deals specifically with its use in Selenium. WebDriver.